
Regex to fix GetSafeHtmlFragment x_ prefix


When using Sanitizer.GetSafeHtmlFragment from Microsoft's AntiXSSLibrary 4.0, I noticed it changes my HTML fragment from:

<pre class="brush: csharp">
</pre>

to:

<pre class="x_brush: x_csharp">
</pre>

Sadly, their API doesn't allow us to disable this behavior, so I'd like to use a regular expression (C#) to replace strings like "x_anything" with "anything" wherever they occur inside a class="" attribute.

Can anyone help me with the RegEx to do this?

Thanks

UPDATE - this worked for me:

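    private string FixGetSafeHtmlFragment(string html)
    {
        string input = html;
        // Look for a class attribute whose value starts with "x_".
        Match match = Regex.Match(input, "class=\"(x_).+\"", RegexOptions.IgnoreCase);

        if (match.Success)
        {
            // Group 1 is the captured "x_" prefix.
            string key = match.Groups[1].Value;
            // Caveat: this removes every "x_" in the whole string,
            // including any outside class attributes.
            return input.Replace(key, "");
        }
        return html;
    }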

Solution

  • I'm not 100% sure about the C# @ (verbatim string) syntax, but I think this should match an x_ at the start of any class="" value and replace it with an empty string:

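    string input = "class=\"x_something\"";
    // In a verbatim (@) string, a literal double quote is written as "".
    Match match = Regex.Match(input, @"class=""(x_).+""",
        RegexOptions.IgnoreCase);
    
    if (match.Success)
    {
        // Group 1 holds the captured "x_" prefix.
        string key = match.Groups[1].Value;
        // Note: Replace removes every "x_" in input, not only the matched one.
        string v = input.Replace(key, "");
    }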
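
Both snippets above call string.Replace on the whole input, so any "x_" in the markup is removed, not only those inside class attributes. If that matters, one possible alternative is to scope the replacement with Regex.Replace and a MatchEvaluator. The following is only a sketch: the class name ClassPrefixFixer and the method name StripClassPrefix are made up for illustration, and it assumes class values are always double-quoted and that the prefix appears at the start of each whitespace-separated token.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of AntiXSSLibrary.
public static class ClassPrefixFixer
{
    // Captures: (1) class=" including the opening quote,
    // (2) the attribute value, (3) the closing quote.
    // The lookbehind keeps names such as data-class from matching.
    private static readonly Regex ClassAttribute = new Regex(
        @"(?<![\w-])(class\s*=\s*"")([^""]*)("")",
        RegexOptions.IgnoreCase);

    // Matches "x_" only at the start of a token inside the value,
    // so a name like "box_shadow" is left alone.
    private static readonly Regex TokenPrefix = new Regex(@"(?<!\S)x_");

    public static string StripClassPrefix(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return html;
        }

        // Rewrite only the value of each class attribute;
        // everything outside the attribute is returned unchanged.
        return ClassAttribute.Replace(html, m =>
            m.Groups[1].Value
            + TokenPrefix.Replace(m.Groups[2].Value, "")
            + m.Groups[3].Value);
    }
}
```

With the fragment from the question, StripClassPrefix("<pre class=\"x_brush: x_csharp\">") should return <pre class="brush: csharp">, while an "x_" in the element text or in other attributes is not touched. Single-quoted or unquoted class values are not handled by this sketch.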