javascript, node.js, encoding, character-encoding

How to convert UTF-8 to EUC-JP in the browser?


I need to convert a UTF-8 string to EUC-JP so that I can use it as the keyword in the URL of an HTTP request.

For example, '那賀崎ゆきね' should become %c6%e1%b2%ec%ba%ea%a4%e6%a4%ad%a4%cd

I found that TextDecoder can't convert to EUC-JP. It only decodes, so the result is still an ordinary Unicode string rather than EUC-JP bytes.

This only converts EUC-JP to UTF-8:

const decoder = new TextDecoder('euc-jp');
const decodedString = decoder.decode(eucjpBytes); 

console.log(decodedString);
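The encoding direction is the missing piece: the standard TextEncoder takes no encoding argument and always produces UTF-8 bytes. A minimal sketch to illustrate this (the `toHex` helper is a hypothetical name introduced only for this example):

```javascript
// TextEncoder has no encoding parameter: it always emits UTF-8 bytes.
const encoder = new TextEncoder();
const utf8Bytes = encoder.encode('ゆ');

// Hypothetical helper: format a byte array as space-separated hex.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join(' ');
}

console.log(encoder.encoding); // "utf-8"
console.log(toHex(utf8Bytes)); // "e3 82 86", whereas EUC-JP needs "a4 e6"
```

So the built-in API can read EUC-JP but not write it, which is why an additional conversion table (for example from a library) is needed.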

There are also Encoding.js, encoding-japanese, and iconv-lite for Node.js, but they seem to need a local environment.

It seems there is no package that I can simply load to do this in the browser.

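const baseUrl = 'https://xxx';
const encoded = 'xxx';
// A const cannot be reassigned, so build the full URL in its own variable.
const url = baseUrl + encoded;
const Http = new XMLHttpRequest();
Http.open("GET", url);
// Register the handler before sending the request.
Http.onload = function (e) {
    let domNewx = new DOMParser().parseFromString(Http.responseText, 'text/html');
};
Http.send();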

Or

const baseUrl = 'https://xxx';
const encoded = 'xxx';
// A const cannot be reassigned, so build the full URL in its own variable.
const url = baseUrl + encoded;
GM_xmlhttpRequest({
    method: 'GET',
    url: url,
    anonymous: true,
    onload: function (result) {
        xhrResult = result.status;
        let domNewx = new DOMParser().parseFromString(result.responseText, 'text/html');
    },
    onerror: function (result) {
        console.log(result);
    },
});

Solution

  • You don't need a local environment: encoding-japanese can be loaded from a CDN.

    Add this script tag to the page's head (any other CDN that hosts the package works too):

    <script src="https://cdn.jsdelivr.net/npm/encoding-japanese@<version>/encoding.min.js"></script>
    

    Then you can encode and decode the string like this (standalone) in the browser:

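    function utf8ToEucjp(utf8String) {
        // Turn the JavaScript string into an array of Unicode code values.
        const unicodeArray = Encoding.stringToCode(utf8String);
        // Convert the code values to EUC-JP bytes.
        const eucjpArray = Encoding.convert(unicodeArray, {
            to: 'EUC-JP',
            from: 'UNICODE',
        });
        // Percent-encode the EUC-JP bytes for use in a URL.
        return Encoding.urlEncode(eucjpArray);
    }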
    
    
    function eucjpToUtf8(eucjpString) {
        const decoded = Encoding.urlDecode(eucjpString);
        const unicodeArray = Encoding.convert(decoded, {
            to: 'UNICODE',
            from: 'EUC-JP',
        });
        return Encoding.codeToString(unicodeArray);
    }
    
    const encoded = utf8ToEucjp('那賀崎ゆきね');
    console.log('encoded:', encoded);
    console.log('decoded:', eucjpToUtf8(encoded));
    

    Output:

    encoded: %C6%E1%B2%EC%BA%EA%A4%E6%A4%AD%A4%CD
    decoded: 那賀崎ゆきね
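The encoded output above uses uppercase hex digits, while the question shows lowercase ones. Percent escapes are case-insensitive, so servers normally accept either form. If the lowercase form is needed anyway, the following is a minimal sketch of normalizing the escapes and appending the keyword to the request URL. The names `lowercasePercentEscapes` and `buildSearchUrl` and the `https://example.invalid/search?q=` URL are assumptions made for illustration, not part of encoding-japanese or of the original code:

```javascript
// Lowercase only the percent escapes (e.g. "%C6" -> "%c6"), leaving any
// unescaped characters in the string untouched.
function lowercasePercentEscapes(encoded) {
  return encoded.replace(/%[0-9A-Fa-f]{2}/g, (escape) => escape.toLowerCase());
}

// Append an already percent-encoded keyword to a base URL as-is.
// (Hypothetical helper; plain concatenation avoids re-encoding the "%" signs.)
function buildSearchUrl(baseUrl, encodedKeyword) {
  return baseUrl + encodedKeyword;
}

// The string below is the "encoded:" output shown above.
const encodedKeyword = '%C6%E1%B2%EC%BA%EA%A4%E6%A4%AD%A4%CD';
const requestUrl = buildSearchUrl(
  'https://example.invalid/search?q=', // placeholder URL, not a real endpoint
  lowercasePercentEscapes(encodedKeyword)
);
console.log(requestUrl);
```

Do not pass the encoded keyword through encodeURIComponent or URLSearchParams afterwards: both would escape each "%" again as "%25" and corrupt the EUC-JP bytes.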