
Conversion between UTF-8 ArrayBuffer and String


I have an ArrayBuffer that contains a UTF-8 encoded string, and I can't find a standard way of converting such an ArrayBuffer into a JS String (which, as I understand it, is encoded using UTF-16).

I've seen this code in numerous places, but I fail to see how it would work with any code point whose UTF-8 encoding is longer than 1 byte.

return String.fromCharCode.apply(null, new Uint8Array(data));
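
To make the concern concrete, here is a minimal sketch (the sample bytes are my own illustration, not from any real data source). The two UTF-8 bytes for "é" come out as two separate characters, because each byte is treated as its own UTF-16 code unit:

```javascript
// UTF-8 encoding of "é" (U+00E9) is the two bytes 0xC3 0xA9.
var data = new Uint8Array([0xC3, 0xA9]).buffer;

// Each byte becomes one UTF-16 code unit, so the result is the two
// characters U+00C3 and U+00A9 rather than the single character "é".
var naive = String.fromCharCode.apply(null, new Uint8Array(data));

console.log(naive.length);               // 2
console.log(naive === "\u00C3\u00A9");   // true
console.log(naive === "\u00E9");         // false
```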

Similarly, I can't find a standard way of converting from a String to a UTF-8 encoded ArrayBuffer.


Solution

  • function stringToUint(string) {
        // UTF-8 encode via encodeURIComponent/unescape, then Base64-encode with btoa.
        // Note: the returned bytes are the Base64 text, not the raw UTF-8 bytes.
        var encoded = btoa(unescape(encodeURIComponent(string))),
            uintArray = [];
        for (var i = 0; i < encoded.length; i++) {
            uintArray.push(encoded.charCodeAt(i));
        }
        return new Uint8Array(uintArray);
    }
    
    function uintToString(uintArray) {
        // Inverse of stringToUint: bytes -> Base64 text -> UTF-8 byte string -> JS string.
        var encodedString = String.fromCharCode.apply(null, uintArray),
            decodedString = decodeURIComponent(escape(atob(encodedString)));
        return decodedString;
    }
    

    With some help from the internet, I put together these little functions; they should solve your problem. Here is the working JSFiddle.

    EDIT:

    Since the source of the Uint8Array is external and you can't use atob, you just need to remove it (working fiddle):

    function uintToString(uintArray) {
        var encodedString = String.fromCharCode.apply(null, uintArray),
            decodedString = decodeURIComponent(escape(encodedString));
        return decodedString;
    }
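
One caveat with the function above: String.fromCharCode.apply passes every byte as a separate argument, and engines limit how many arguments a call can take, so very large arrays can throw a RangeError. A possible workaround, sketched here under my own assumptions (the name uintToStringChunked and the chunk size are hypothetical choices, not part of the original answer), is to build the intermediate string in chunks and decode it once at the end:

```javascript
// Hypothetical chunked variant of uintToString for large inputs.
function uintToStringChunked(uintArray) {
    var CHUNK_SIZE = 0x8000; // 32768 bytes per call; an arbitrary conservative value
    var binary = "";
    for (var i = 0; i < uintArray.length; i += CHUNK_SIZE) {
        // subarray clamps the end index, so the last chunk may be shorter.
        binary += String.fromCharCode.apply(null, uintArray.subarray(i, i + CHUNK_SIZE));
    }
    // Decode only after all chunks are joined, so a multi-byte sequence
    // that straddles a chunk boundary is still decoded correctly.
    return decodeURIComponent(escape(binary));
}

// Usage: the UTF-8 bytes for "hé".
console.log(uintToStringChunked(new Uint8Array([0x68, 0xC3, 0xA9])) === "h\u00E9"); // true
```

This still relies on escape, so the warning below applies to it as well.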
    

    Warning: escape and unescape are deprecated and no longer recommended. See this.
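
If the environment provides the Encoding API (TextEncoder and TextDecoder, available in modern browsers and recent Node.js versions), the same conversions can be done without escape and unescape. This is a minimal sketch; the function names stringToUtf8 and utf8ToString are hypothetical and chosen only for illustration:

```javascript
// Assumes TextEncoder and TextDecoder exist as globals.
function stringToUtf8(string) {
    // TextEncoder always produces UTF-8; encode() returns a Uint8Array.
    return new TextEncoder().encode(string);
}

function utf8ToString(bytes) {
    // bytes may be an ArrayBuffer or a typed array such as Uint8Array.
    return new TextDecoder("utf-8").decode(bytes);
}

var text = "h\u00E9llo \u20AC";          // "héllo €"
var bytes = stringToUtf8(text);
console.log(bytes.length);               // 10 (é takes 2 bytes, € takes 3)

// Copy out a standalone ArrayBuffer holding exactly these bytes.
var buffer = bytes.buffer.slice(bytes.byteOffset, bytes.byteOffset + bytes.byteLength);
console.log(utf8ToString(buffer) === text); // true
```

To get an ArrayBuffer rather than a Uint8Array, slice the underlying buffer as shown, since a typed array is not guaranteed to cover its whole buffer.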