internationalization dart dart-html

Dart sanitize international text


How can I best sanitize text like

abc&#39; a>b<c & a<b>c

converting it to

abc&#39; a&gt;b&lt;c &amp; a&lt;b&gt;c

which is displayed as

abc' a>b<c & a<b>c

so that I can use it via

myDiv.innerHtml = ...   or
myDiv.setInnerHtml(..., validator: myValidator, treeSanitizer: mySanitizer);

A text assignment, myDiv.text = ..., escapes every & as well as < and >, so the valid apostrophe entity &#39; is no longer rendered as an apostrophe. HtmlEscape.convert(..) likewise escapes every &, whichever HtmlEscapeMode is used.

I could write my own sanitizer, but I hope I have overlooked a standard library call.
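
The double escaping described in the question can be reproduced with dart:convert alone. This is a minimal sketch; the helper name escapeElement is hypothetical, and HtmlEscapeMode.element is chosen only because it limits escaping to &, < and >.

```dart
import 'dart:convert';

// Hypothetical helper: escapes &, < and > and nothing else.
String escapeElement(String text) =>
    const HtmlEscape(HtmlEscapeMode.element).convert(text);

void main() {
  // The & that starts the entity &#39; is escaped as well,
  // so the browser would show the literal text &#39; instead of '.
  print(escapeElement("abc&#39; a>b<c & a<b>c"));
  // prints: abc&amp;#39; a&gt;b&lt;c &amp; a&lt;b&gt;c
}
```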


Solution


    RegExp for HTML entities

    import 'dart:convert';
    import 'dart:html';

    void main() {
      final htmlStr = r'abc&#39; a>b<c & a<b>' * 3;
      // Group 1: plain text in front of an entity.
      // Group 2: the entity itself, numeric (&#39;) or named (&amp;).
      // Group 3: trailing plain text once no further entity follows.
      final reg = RegExp(
          r'(.*?)(&(?:#[1-9][0-9]{1,3}|[A-Za-z][0-9A-Za-z]+);)|(.*)');
      final resStr = StringBuffer();
      for (final m in reg.allMatches(htmlStr)) {
        // Escape the plain text, but pass recognised entities through unchanged.
        resStr
          ..write(htmlEscape.convert(m.group(1) ?? ''))
          ..write(m.group(2) ?? '')
          ..write(htmlEscape.convert(m.group(3) ?? ''));
      }
      print(resStr);
      document.body?.setInnerHtml(resStr.toString());
    }
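
The same idea can also be packaged as a function that does not need dart:html, which makes it easier to try outside a browser. This is only a sketch of an alternative formulation, not the answer's own code: the names entityPattern and escapePreservingEntities are hypothetical, the pattern does not check named entities against the real HTML entity list, and HtmlEscapeMode.element is assumed so that only &, < and > are escaped in the plain text.

```dart
import 'dart:convert';

// Hypothetical pattern: a numeric entity such as &#39; or something that
// looks like a named entity such as &amp;.
final entityPattern = RegExp(r'&(?:#[1-9][0-9]{1,3}|[A-Za-z][0-9A-Za-z]+);');

// Escapes &, < and > in [input] but leaves existing entities untouched.
String escapePreservingEntities(String input) {
  const escaper = HtmlEscape(HtmlEscapeMode.element);
  return input.splitMapJoin(
    entityPattern,
    onMatch: (m) => m[0]!, // keep the entity exactly as written
    onNonMatch: escaper.convert, // escape everything between entities
  );
}

void main() {
  print(escapePreservingEntities("abc&#39; a>b<c & a<b>c"));
  // prints: abc&#39; a&gt;b&lt;c &amp; a&lt;b&gt;c
}
```

The returned string can then be passed to setInnerHtml in the same way as resStr above.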