Tags: javascript, html, node.js, decode

A plain JavaScript way to decode HTML entities that works in both browsers and Node.js


How can we decode HTML entities like &nbsp; or &#39; back to their original characters?

In browsers we can create a DOM element to do the trick (see here), or we can use a library like he.
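
For reference, the DOM-based approach might look roughly like the sketch below. The function name decodeWithDom is hypothetical, and returning null when there is no global document is an assumption made here only to show why the trick does not carry over to plain Node.js.

```javascript
// Hypothetical helper illustrating the browser-only DOM trick.
function decodeWithDom(encodedString) {
    // Plain Node.js has no global document, so there is nothing to decode with.
    if (typeof document === "undefined") {
        return null;
    }
    // A textarea parses its content as text: entities are decoded,
    // but markup is not turned into elements.
    var textarea = document.createElement("textarea");
    textarea.innerHTML = encodedString;
    return textarea.value;
}
```

In a browser, decodeWithDom("&lt;p&gt;") should give "<p>"; in plain Node.js the same call returns null.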

In Node.js we can use a third-party library like html-entities.

What if we want to use plain JavaScript to do the job?

There are many similar questions and useful answers on Stack Overflow, but I couldn't find an approach that works in both browsers and Node.js, so I'd like to share mine.

I have posted my approach as an answer below. I hope it helps someone. :)


Solution

  • There are many similar questions and useful answers on Stack Overflow, but I couldn't find an approach that works in both browsers and Node.js, so I'd like to share mine.

    It handles HTML entities like &nbsp; &lt; &gt; &#39; and even numerically encoded Chinese characters.

    I suggest using this function (inspired by some other answers):

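    function decodeEntities(encodedString) {
        // Named entities this function knows about; anything else is left untouched.
        var translate = {
            "nbsp": " ", // decoded to a plain space here
            "amp" : "&",
            "quot": "\"",
            "lt"  : "<",
            "gt"  : ">"
        };
        // Match named and decimal entities in a single pass, so that input such as
        // "&amp;#39;" decodes to "&#39;" and is not decoded a second time.
        var entity_re = /&(?:(nbsp|amp|quot|lt|gt)|#(\d+));/g;
        return encodedString.replace(entity_re, function(match, entity, numStr) {
            if (entity) {
                return translate[entity];
            }
            var num = parseInt(numStr, 10);
            // fromCodePoint also handles code points above 0xFFFF;
            // out-of-range numbers are left as they are instead of throwing.
            return num <= 0x10FFFF ? String.fromCodePoint(num) : match;
        });
    }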
    

    This implementation also works in a Node.js environment.

    decodeEntities("&#21704;&#21704;&nbsp;&#39;&#36825;&#20010;&#39;&amp;&quot;&#37027;&#20010;&quot;&#22909;&#29609;&lt;&gt;") //哈哈 '这个'&"那个"好玩<>

    As a new user, I only have 1 reputation :(

    I can't comment on or answer existing posts, so this is the only thing I can do for now.

    Edit 1

    I think this answer works even better than mine, although no one has upvoted it.
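
The decodeEntities function above covers five named entities and decimal references. If hexadecimal references (such as &#x27;) or &apos; also show up in the input, one possible extension is sketched below. The name decodeEntitiesExtended and the extra "apos" entry are assumptions for illustration and not part of the original answer.

```javascript
// Hypothetical extension of decodeEntities: adds &apos; and hexadecimal references.
function decodeEntitiesExtended(encodedString) {
    var translate = {
        nbsp: " ",
        amp: "&",
        quot: "\"",
        lt: "<",
        gt: ">",
        apos: "'"
    };
    // One pass over the input: named entity, hex reference, or decimal reference.
    var entityRe = /&(?:(nbsp|amp|quot|lt|gt|apos)|#[xX]([0-9a-fA-F]+)|#(\d+));/g;
    return encodedString.replace(entityRe, function (match, name, hex, dec) {
        if (name) {
            return translate[name];
        }
        var codePoint = hex ? parseInt(hex, 16) : parseInt(dec, 10);
        // Leave out-of-range references untouched instead of throwing.
        return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : match;
    });
}
```

For example, decodeEntitiesExtended("&#x54C8;&apos;&#x27;") returns 哈''. Entities outside the small table are returned unchanged, so a full decoder would still need the complete list of named character references.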