javascript character-encoding html-entities html-encode

JavaScript equivalent for converting character entities


I have the following HTML snippet with JavaScript embedded inline.

<div class="social_icon twitter" onclick="shareSocialWall('twitter','New Comment on xxxx - WTF, if the dancers don&amp;acirc;&amp;#128;&amp;#153;t come in until 4ish','https:/xxxx')"></div>

As you can see, there's an onclick event that fires shareSocialWall. What's important here is the text dancers don&amp;acirc;&amp;#128;&amp;#153;t.

When the text gets passed to the function shareSocialWall, here's what happens:

 location='https://twitter.com/intent/tweet?text='+text+'&url='+encodeURIComponent(url);

The problem is that the text breaks the call that performs this tweet. Is there a way to encode this text inside the JavaScript so that it does not break the call?

The path for this text is:

- Perl => encodes entities on the comment text as an XML node
- XSLT => passes the XML text to the JavaScript function (uses disable output encoding, which apparently does nothing here)
- JS => handles the text inside the shareSocialWall function

Solution

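  • Your text seems to have been mangled already, probably incorrectly double encoded. The sequence &acirc;&#128;&#153; is what the UTF-8 bytes of a right single quotation mark (’, bytes E2 80 99) look like when each byte is treated as a separate character and entity-encoded, and the &amp; prefixes suggest it was then escaped a second time.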

    You could just remove all HTML entities with a regex:

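    var str = 'New Comment on xxxx - WTF, if the dancers don&amp;acirc;&amp;#128;&amp;#153;t come in until 4ish';
    // Remove every whitespace-free run that starts with & and ends with ;
    console.log(str.replace(/&[^\s]*;/g, ''));
    // New Comment on xxxx - WTF, if the dancers dont come in until 4ish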

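    Note that this is pretty aggressive: within a run of non-whitespace characters it removes everything from the first & to the last ;, so it might remove more text than you wish if your text is really badly encoded, for instance with a missing ;.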
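
If stripping the entities is acceptable, a narrower pattern combined with encodeURIComponent on the text may be enough to keep the tweet link intact. The sketch below is only one way to do it: stripEntities and buildTweetUrl are hypothetical names, the entity pattern is an assumption about what the text can contain, and https://example.com/post is a placeholder URL.

```javascript
// Hypothetical helper: the name and the entity pattern are assumptions for illustration.
function stripEntities(text) {
  return text
    // Undo one level of double encoding: "&amp;acirc;" becomes "&acirc;".
    .replace(/&amp;/g, '&')
    // Remove only well-formed entities: decimal, hexadecimal, or named.
    .replace(/&(?:#\d+|#x[0-9a-f]+|[a-z][a-z0-9]*);/gi, '');
}

// Hypothetical helper: builds the intent URL with both parameters percent-encoded,
// so that stray & or # characters in the text cannot cut the query string short.
function buildTweetUrl(text, url) {
  return 'https://twitter.com/intent/tweet?text=' +
    encodeURIComponent(stripEntities(text)) +
    '&url=' + encodeURIComponent(url);
}

var str = 'New Comment on xxxx - WTF, if the dancers don&amp;acirc;&amp;#128;&amp;#153;t come in until 4ish';
console.log(stripEntities(str));
console.log(buildTweetUrl(str, 'https://example.com/post'));
```

Unlike the greedy regex, this pattern leaves text between two entities alone, but it still drops the character the entities stood for (here the apostrophe) rather than restoring it.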