linux shell iconv

Converting \u003c to < character with Linux tools


From an ajax call, I got back something like this:

{"d":"\u003cdiv class=\"popup_title\"\u003eBENTELER Autótechnika Kft.\u003c/div\u003e\u003cdiv style=\"font-size:10pt;font-weight:bold;\"\u003e8060 Mór, Akai út 5.

I'd like to convert it to a "usable" format, so that \u003c simply becomes a < character.

The header of the AJAX response says that the encoding is iso-8859-2 (content-type: text/plain; charset=iso-8859-2), but I'm unsure.

I tried to use iconv with many options, but no luck.

What is interesting is that for instance this site:

https://www.online-toolz.com/tools/text-unicode-entities-convertor.php

does the trick without any extra settings; I just can't find out what the "from encoding" should be.

I'd be happy to use iconv.


Solution

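  • The \uXXXX sequences themselves are simply ASCII, so there is no "from encoding" for iconv to convert. They are Unicode escape codes, as used e.g. in JSON and JavaScript (and Python) string literals.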

    If the value you get from the AJAX call is valid JSON (as presumably it will be), use a JSON tool to extract it.

    bash$ jq -r .d <<\:
    {"d":"\u003cdiv class=\"popup_title\"\u003eBENTELER Autótechnika Kft.\u003c/div\u003e\u003cdiv style=\"font-size:10pt;font-weight:bold;\"\u003e8060 Mór, Akai út 5."}
    :
    <div class="popup_title">BENTELER Autótechnika Kft.</div><div style="font-size:10pt;font-weight:bold;">8060 Mór, Akai út 5.
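
If jq is not available, the same decoding can be sketched with Python's standard json module, since parsing the JSON is what turns \u003c into <. This is a minimal sketch under assumptions, not a definitive implementation: the function name decode_ajax_payload and the shortened sample are hypothetical, and the iso-8859-2 default is only taken from the content-type header quoted in the question, which the asker was unsure about.

```python
import json


def decode_ajax_payload(raw: bytes, charset: str = "iso-8859-2", key: str = "d") -> str:
    """Hypothetical helper: decode response bytes, parse JSON, return one field."""
    # Assumption: the bytes really are in the charset named by the header.
    text = raw.decode(charset)
    # json.loads resolves \u003c, \u003e and \" while parsing the string value.
    return json.loads(text)[key]


# Shortened, made-up sample modelled on the response in the question,
# encoded as ISO-8859-2 to mimic the bytes coming off the wire.
sample = r'{"d":"\u003cdiv class=\"popup_title\"\u003eBENTELER Autótechnika Kft.\u003c/div\u003e"}'.encode("iso-8859-2")

print(decode_ajax_payload(sample))
# prints: <div class="popup_title">BENTELER Autótechnika Kft.</div>
```

If the response turns out to be UTF-8 after all, passing charset="utf-8" is the only change needed; the \uXXXX escapes are handled the same way in either case.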