
Escape HTML & non-ASCII chars with JavaScript


OK, so I need to replace all <, & and > plus all non-ASCII characters with their HTML entity counterparts. I've tried Underscore.string.escapeHTML, but it didn't seem to touch the non-ASCII chars.

For example I need this:

<div>föö bär</div>

converted into this:

&lt;div&gt;f&ouml;&ouml; b&auml;r&lt;/div&gt;

Obviously auml and ouml are not enough. I need a valid ASCII string no matter what buttons the user chooses to push or, heaven forbid, even if they write with some moonspeak keyboard.


Solution

  • I found what you are looking for here

    For your purposes, use the htmlEncode function.

    They define a number of other useful functions within the object:

    HTML2Numerical: Converts HTML entities to their numerical equivalents.

    NumericalToHTML: Converts numerical entities to their HTML equivalents.

    numEncode: Numerically encodes unicode characters.

    htmlDecode: Decodes HTML encoded text to its original state.

    htmlEncode: Encodes HTML to either numerical or HTML entities. This is determined by the EncodeType property.

    XSSEncode: Encodes the basic characters used in XSS attacks to malform HTML.

    correctEncoding: Corrects any double encoded ampersands.

    stripUnicode: Removes all unicode characters.

    hasEncoded: Returns true if a string contains html encoded entities within it.

    Source: www.strictly-software.com

    Beware of the license agreement - GPL, The MIT License (MIT)
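
    If the license is a problem, or you only need the ASCII-safe output described in the question, a few lines of plain JavaScript may be enough. The following is a minimal sketch, not the library's code: escapeToAscii is a hypothetical helper name, and it assumes numeric entities (&#246; instead of &ouml;) are acceptable, since they exist for every character while named entities do not.

```javascript
// Hypothetical helper, not part of the library mentioned above.
// Assumption: numeric character references are acceptable output.
function escapeToAscii(input) {
  // Match the three markup characters, then surrogate pairs (so a character
  // outside the BMP becomes one entity), then any other non-ASCII code unit.
  return input.replace(
    /[<>&]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[^\x00-\x7F]/g,
    function (ch) {
      switch (ch) {
        case '<': return '&lt;';
        case '>': return '&gt;';
        case '&': return '&amp;';
        // codePointAt(0) reads the full code point of a surrogate pair.
        default: return '&#' + ch.codePointAt(0) + ';';
      }
    }
  );
}

console.log(escapeToAscii('<div>föö bär</div>'));
// &lt;div&gt;f&#246;&#246; b&#228;r&lt;/div&gt;
```

    The result contains only ASCII characters, and a browser renders it the same as the named-entity version. Quotes are left alone because the question only asks for <, & and >; if the output ends up inside an attribute value, you would probably want to escape " and ' as well.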