
Decode escaped Cyrillic with JavaScript


The problem is as follows. The backend I'm working with uses a broken Pylons 0.9.7 release that automatically escapes cookies on saving (see "I'm using Pylons and having issues with response.set_cookie"; I can't update it to a fixed revision). As a result, Cyrillic URLs saved in cookies from a request turn into bizarre escaped strings. For example:

ооо-фронталь.рф

will be saved as

\320\276\320\276\320\276-\321\204\321\200\320\276\320\275\321\202\320\260\320\273\321\214.\321\200\321\204

I've tried quoting it with urllib before saving, but then I'm left with:

'%5C320%5C276%5C320%5C276%5C320%5C276-%5C321%5C204%5C321%5C200%5C320%5C276%5C320%5C275%5C321%5C202%5C320%5C260%5C320%5C273%5C321%5C214.%5C321%5C200%5C321%5C204'

which doesn't actually make things any better. Is there any way to decode this with JavaScript? encodeURI/decodeURI doesn't work in this case :/


Solution

  • These are the UTF-8 bytes of the string written as octal escapes, and JavaScript has no built-in function that decodes them directly.

    Here is a way that may work, although it's pretty terrible:

    In hex, \320\276 is 0xD0 0xBE, which URL-encoded is %D0%BE, so:

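    var s = "\\320\\276\\320\\276\\320\\276-\\321\\204\\321\\200\\320\\276\\320\\275\\321\\202\\320\\260\\320\\273\\321\\214.\\321\\200\\321\\204";

    // Match a backslash followed by exactly three octal digits (one byte, 000-377).
    var r = s.replace(/\\([0-3][0-7]{2})/g, function(match, octal) {
        // Octal to hex, zero-padded to two digits so bytes below 0x10 stay valid.
        var hex = parseInt(octal, 8).toString(16);
        return "%" + (hex.length < 2 ? "0" + hex : hex);
    });

    // The percent-encoded UTF-8 bytes decode back to the original text.
    alert( decodeURIComponent(r) );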
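
The same idea can also be sketched without going through percent-encoding, by collecting the bytes and decoding them as UTF-8. This is only a sketch: the function name `decodeOctalEscapes` is made up for illustration, and it assumes an environment where the standard `TextEncoder` and `TextDecoder` globals exist (modern browsers, recent Node.js).

```javascript
// Hypothetical helper (not part of any library): turns "\ooo" octal escapes
// back into text by rebuilding the UTF-8 byte sequence.
function decodeOctalEscapes(input) {
    const encoder = new TextEncoder();
    const bytes = [];
    // A backslash followed by exactly three octal digits (one byte, 000-377).
    const escape = /\\([0-3][0-7]{2})/g;
    let last = 0;
    let match;
    while ((match = escape.exec(input)) !== null) {
        // Literal text between escapes is copied through as its own UTF-8 bytes.
        bytes.push(...encoder.encode(input.slice(last, match.index)));
        // The escape itself contributes a single byte.
        bytes.push(parseInt(match[1], 8));
        last = escape.lastIndex;
    }
    // Copy whatever follows the final escape.
    bytes.push(...encoder.encode(input.slice(last)));
    return new TextDecoder("utf-8").decode(new Uint8Array(bytes));
}

const escaped = "\\320\\276\\320\\276\\320\\276-\\321\\204\\321\\200\\320\\276\\320\\275\\321\\202\\320\\260\\320\\273\\321\\214.\\321\\200\\321\\204";
console.log(decodeOctalEscapes(escaped)); // ооо-фронталь.рф
```

If the cookie value was additionally quoted with urllib (the `%5C320...` form from the question), running it through `decodeURIComponent` first should give back the backslash form, which can then be passed to this function.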