javascript json node.js parsing non-ascii-characters

JS - JSON.parse - preserve special characters


I'm running a Node.js app that fetches certain posts from an API. When a post contains special characters, JSON.parse fails.

By special characters I mean text in any other language, emojis, etc.

Parsing works fine when posts don't have special characters. I need to preserve all of the text; I can't just ignore those characters, since I need to handle every possible language.

I'm getting the following error:

"Unexpected token �"

Example of a text I'm supposed to be able to handle:

"summary": "★リプライは殆ど見てません★ Tokyo-based E-J translator. ここは流れてくるニュースの自分用記録でRT&メモと他人の言葉の引用、ブログのフィード。ここで意見を述べることはしません。「交流」もしません。関心領域は匦"�アイルランドと英国(他は専門外)※Togetterコメ欄と陰謀論が嫌いです。"

How can I properly parse such a text?

Thanks


Solution

  • You have misdiagnosed your problem; it has nothing to do with that character.

    Your code contains an unescaped " immediately before the special character you think is causing the problem. The early " is prematurely terminating the string.

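    If you insert a backslash to escape the ", your string can be parsed as JSON just fine. Note that inside a JavaScript string literal the backslash itself must be escaped, so it is written as \\" in the source: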

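        // In this JS literal, \\" becomes \" in the string, which JSON reads as an escaped quote.
        const x = '{"summary": "★リプライは殆ど見てません★ Tokyo-based E-J translator. ここは流れてくるニュースの自分用記録でRT&メモと他人の言葉の引用、ブログのフィード。ここで意見を述べることはしません。「交流」もしません。関心領域は匦\\"�アイルランドと英国(他は専門外)※Togetterコメ欄と陰謀論が嫌いです。"}';
    
        // Parses without error; the non-ASCII text is preserved as-is.
        console.log(JSON.parse(x));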
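
To make the diagnosis above concrete, here is a minimal sketch that parses the same kind of text twice: once serialized with JSON.stringify (which escapes the embedded " itself) and once concatenated by hand with the quote left unescaped. The sample text is shortened from the question, and the tryParse helper is a hypothetical name introduced only for this illustration. The exact wording of the error message depends on the Node.js version, so it is not shown.

```javascript
// Sample text (an assumption: shortened from the question, with an emoji added)
// containing non-ASCII characters and one embedded double quote.
const text = '関心領域は"アイルランド ★ 😀';

// JSON.stringify escapes the embedded quote; non-ASCII characters pass through unchanged.
const valid = JSON.stringify({ summary: text });

// Hand-built JSON with the same quote left unescaped, as in the question.
const broken = '{"summary": "' + text + '"}';

// Hypothetical helper: returns the parsed value or the error message instead of throwing.
function tryParse(json) {
  try {
    return { ok: true, value: JSON.parse(json) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

const good = tryParse(valid);
const bad = tryParse(broken);

// The properly escaped document round-trips with every character intact.
console.log(good.ok, good.value.summary === text);

// The unescaped quote ends the string early, so the next character is an unexpected token.
console.log(bad.ok, bad.error);
```

If the JSON comes from an API you do not control, the non-ASCII characters are probably not the culprit; it is worth logging the raw response body to check whether the producer is emitting unescaped quotes.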