javascript unicode data-uri

Making a data URI from a Unicode string


I'm trying to make JavaScript download a Unicode string as a text file. I'm at the point where I need to convert the string into a data URL, so that the user can open the URL and download the file. Here is a simplified version of my code:

var myString = "⌀怴ꁴ㥍䯖챻巏ܛ肜怄셀겗孉贜짥孍ಽ펾曍㩜䝺捄칡⡴얳锭劽嫍ᯕ�";

var link = document.createElement('a');
link.setAttribute('href', 'data:text/plain;base64,' + myString);

I don't know what character set to use or how to encode my string. I've tried combinations of encodeURI() and btoa(), but haven't managed to get anything working. encodeURI() throws "Uncaught URIError: URI malformed" for some characters, such as U+DA7B.
I would prefer the final downloaded file to have the same characters as the initial string.


Solution

  • This works for me:

    decodeURIComponent(atob(btoa(encodeURIComponent("中文"))))
    // Output: 中文
    

    In your case, \uDA7B fails because it is a high surrogate (U+D800 to U+DBFF), which is meaningful only as the first half of a surrogate pair.

    That is why you get a URIError when you call:

    encodeURIComponent('\uDA7B') // throws URIError

    Pair it with a character from the low surrogates (DC00-DFFF) and it works:

    encodeURIComponent('\uDA7B\uDC01')
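
To tie this back to the question, here is a minimal sketch of one way to build the data URI. It is not the only approach. It encodes the string as UTF-8 with TextEncoder and then Base64-encodes the bytes with btoa(). The helper names hasLoneSurrogate and toDataUri are made up for illustration. One caveat: an unpaired surrogate has no UTF-8 representation, and TextEncoder replaces it with U+FFFD, so a string containing a lone \uDA7B will not round-trip exactly.

```javascript
// Hypothetical helper: true if `str` contains an unpaired surrogate code unit.
// The regex has no `u` flag, so it matches individual UTF-16 code units.
function hasLoneSurrogate(str) {
  return /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/.test(str);
}

// Hypothetical helper: build a Base64 data URI holding `str` as UTF-8 text.
function toDataUri(str) {
  // TextEncoder always produces UTF-8; lone surrogates become U+FFFD.
  const bytes = new TextEncoder().encode(str);

  // btoa() expects a "binary string" with one character per byte (0-255).
  let binary = '';
  for (const b of bytes) {
    binary += String.fromCharCode(b);
  }

  return 'data:text/plain;charset=utf-8;base64,' + btoa(binary);
}

// Browser-only usage, guarded so the snippet also loads outside a page.
if (typeof document !== 'undefined') {
  const link = document.createElement('a');
  link.setAttribute('href', toDataUri('中文'));
  link.setAttribute('download', 'example.txt'); // file name is an assumption
}
```

Calling hasLoneSurrogate(myString) before encoding lets you decide whether to warn the user or accept the U+FFFD substitution. Declaring charset=utf-8 in the URI tells the browser how to interpret the decoded bytes.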