
Bug in System.Web.Security.AntiXss.AntiXssEncoder.MarkAsSafe and LowerCodeCharts.None?


I submitted this as a bug on Microsoft Connect and linked this question in the submission.


I'm playing around with System.Web.Security.AntiXss.AntiXssEncoder.MarkAsSafe to test a few things, and I found this.

This code, which looks correct:

protected void Application_Start(object sender, EventArgs e)
{
    System.Web.Security.AntiXss.AntiXssEncoder.MarkAsSafe
    (
        LowerCodeCharts.None,
        LowerMidCodeCharts.None,
        MidCodeCharts.None,
        UpperMidCodeCharts.None,
        UpperCodeCharts.None
    );

    /* two random characters from each PDF linked below */
    System.Diagnostics.Debug.WriteLine(
        System.Web.Security.AntiXss.AntiXssEncoder.HtmlEncode(
               "%* ܼ aW ?? ?? ??", false));

    System.Diagnostics.Debug.WriteLine(
        System.Web.Security.AntiXss.AntiXssEncoder.HtmlEncode(
               "%* ܼ aW ?? ?? ??", true));
}

does not work as it should: the strings above are not encoded.

It seems the encoder still uses LowerCodeCharts.Default and ignores my LowerCodeCharts.None. LowerCodeCharts.Default covers these character sets (I found the list of each character set here):

BasicLatin, C1ControlsAndLatin1Supplement, LatinExtendedA, LatinExtendedB, IpaExtensions, SpacingModifierLetters, CombiningDiacriticalMarks


So, after looking at System.Web.Security.AntiXss.UnicodeCharacterEncoder with ILSpy, I found this code:

public static void MarkAsSafe(....)
{
    if (lowerCodeCharts == UnicodeCharacterEncoder.currentLowerCodeChartSettings &&
        lowerMidCodeCharts == UnicodeCharacterEncoder.currentLowerMidCodeChartSettings &&
        midCodeCharts == UnicodeCharacterEncoder.currentMidCodeChartSettings &&
        upperMidCodeCharts == UnicodeCharacterEncoder.currentUpperMidCodeChartSettings &&
        upperCodeCharts == UnicodeCharacterEncoder.currentUpperCodeChartSettings)
    {
        return;
    }

    /* actual code that does something */
}

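ILSpy also shows that the default value of all five current...Settings fields is None.

So my first MarkAsSafe call, with None for everything, matches the current settings and returns early without doing anything.

When I call the following, with or without my all-None MarkAsSafe call before it: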

System.Diagnostics.Debug.WriteLine(
    System.Web.Security.AntiXss.AntiXssEncoder.HtmlEncode(
           "%* ܼ aW ?? ?? ??", false));

it goes through this code (again from System.Web.Security.AntiXss.UnicodeCharacterEncoder, as shown by ILSpy):

private static void InitialiseSafeList()
{
    /* some code */
    if (UnicodeCharacterEncoder.characterValues == null)
    {
        UnicodeCharacterEncoder.InitialiseSafeList();
    }
    /* some code */
}

This happens because UnicodeCharacterEncoder.characterValues is null at that point; I verified this with a Watch window in Visual Studio. The UnicodeCharacterEncoder.InitialiseSafeList method has this line:

SafeList.PunchUnicodeThrough(ref UnicodeCharacterEncoder.characterValues, 
                                 LowerCodeCharts.Default, 
                                 LowerMidCodeCharts.None, 
                                 MidCodeCharts.None, 
                                 UpperMidCodeCharts.None, 
                                 UpperCodeCharts.None);

So, to make it REALLY None, I tried calling MarkAsSafe twice: once with a fake value so that the call is not skipped, and then again with ALL None:

System.Web.Security.AntiXss.AntiXssEncoder.MarkAsSafe
(
    LowerCodeCharts.None,
    LowerMidCodeCharts.None,
    MidCodeCharts.Arrows,      /* FAKE VALUE so that the call is not skipped */
    UpperMidCodeCharts.None,
    UpperCodeCharts.None
);

System.Web.Security.AntiXss.AntiXssEncoder.MarkAsSafe
(
    LowerCodeCharts.None,      /* NOW THIS WORKS */
    LowerMidCodeCharts.None,
    MidCodeCharts.None,
    UpperMidCodeCharts.None,
    UpperCodeCharts.None
);

Now I actually get an encoded string:

%*ܼąŴƣǼɤʩ˦˿

However, there is no way to reset the internal array UnicodeCharacterEncoder.characterValues. In my example, since I used MidCodeCharts.Arrows, the arrow characters are now treated as safe, and it is impossible to change them back to unsafe.
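
The behaviour described so far can be reproduced with a small self-contained model. This is only a sketch: the Charts enum, the EncoderModel class and its Reset helper are hypothetical stand-ins, not the library's types, and the assumption that the elided part of MarkAsSafe only ever adds characters to the safe list is inferred from the observation that Arrows cannot be made unsafe again.

```csharp
using System;

// Hypothetical stand-in for the library's chart enums (one enum instead of five).
[Flags]
enum Charts { None = 0, Default = 1, Arrows = 2 }

static class EncoderModel
{
    // Like the decompiled current...Settings fields: the recorded setting starts as None.
    private static Charts currentSettings = Charts.None;
    // Like characterValues: null until something builds the safe list.
    private static Charts? safeList = null;

    public static void MarkAsSafe(Charts charts)
    {
        // Early return seen in ILSpy: a first call with None matches the
        // initial recorded setting, so nothing is built.
        if (charts == currentSettings) return;

        // Assumption: the elided code only adds to the safe list, never removes.
        safeList = (safeList ?? Charts.None) | charts;
        currentSettings = charts;
    }

    public static Charts SafeListInUse()
    {
        // Lazy initialisation on first encode uses Default, not None.
        if (safeList == null) safeList = Charts.Default;
        return safeList.Value;
    }

    // Demo-only helper; the real class offers nothing like it.
    public static void Reset()
    {
        currentSettings = Charts.None;
        safeList = null;
    }
}

static class Program
{
    static void Main()
    {
        // A single all-None call is skipped, so encoding falls back to Default.
        EncoderModel.MarkAsSafe(Charts.None);
        Console.WriteLine(EncoderModel.SafeListInUse()); // prints "Default"

        // The two-call workaround avoids Default but leaves Arrows marked as safe.
        EncoderModel.Reset();
        EncoderModel.MarkAsSafe(Charts.Arrows);
        EncoderModel.MarkAsSafe(Charts.None);
        Console.WriteLine(EncoderModel.SafeListInUse()); // prints "Arrows"
    }
}
```

In this model the recorded setting and the actual safe list start out disagreeing (None versus a lazily built Default), which is the mismatch that makes the first all-None call a no-op.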


For fun, I wanted to find a way to make it work as it should, and this does it.

Please, this code is not for production use. PLEASE!!

/* needs: using System.Reflection; */
var assembly = Assembly.GetAssembly(typeof(System.Web.Security.AntiXss.AntiXssEncoder));
var type = assembly.GetType("System.Web.Security.AntiXss.UnicodeCharacterEncoder");
var field = type.GetField("currentLowerCodeChartSettings", BindingFlags.Static | BindingFlags.NonPublic);
/* set the recorded setting to a value that is not None, so the next MarkAsSafe call is not skipped */
field.SetValue(null, -1);


System.Web.Security.AntiXss.AntiXssEncoder.MarkAsSafe
(
    System.Web.Security.AntiXss.LowerCodeCharts.None,
    System.Web.Security.AntiXss.LowerMidCodeCharts.None,
    System.Web.Security.AntiXss.MidCodeCharts.None,
    System.Web.Security.AntiXss.UpperMidCodeCharts.None,
    System.Web.Security.AntiXss.UpperCodeCharts.None
);

Please, this code is not for production use. PLEASE!!


So my question is: with the current implementation of the library, is there another way to have every code chart marked as not safe?


Solution

  • It was confirmed as a bug in January, and it has just been closed as "won't fix".

    It seems that if you are hit by this bug, you have to use the NOT RECOMMENDED code in my question.

    Also, if you have to change the encoding many times at runtime, you need to find a way to manually reset the internal array. Otherwise, the encoding will still include the characters from every previous MarkAsSafe call. This can be done by manipulating private / internal variables, which is NOT RECOMMENDED.
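
    A manual reset along those lines might look like the sketch below. It is NOT for production use and rests on several assumptions: that characterValues and currentLowerCodeChartSettings are private static fields of UnicodeCharacterEncoder (the names come from the ILSpy output in the question), and that MarkAsSafe rebuilds the array when it finds it null, which the question does not show. The AntiXssReset class and ResetToAllNone method are hypothetical names, and the code ignores any locking the library may do internally.

```csharp
using System;
using System.Reflection;
using System.Web.Security.AntiXss;

static class AntiXssReset
{
    // NOT RECOMMENDED: depends on private implementation details that can change.
    public static void ResetToAllNone()
    {
        Assembly assembly = typeof(AntiXssEncoder).Assembly;
        Type type = assembly.GetType(
            "System.Web.Security.AntiXss.UnicodeCharacterEncoder", throwOnError: true);
        const BindingFlags flags = BindingFlags.Static | BindingFlags.NonPublic;

        // Field names as seen in ILSpy; their visibility is an assumption.
        FieldInfo values = type.GetField("characterValues", flags);
        FieldInfo lower = type.GetField("currentLowerCodeChartSettings", flags);
        if (values == null || lower == null)
        {
            throw new InvalidOperationException("Expected private fields were not found.");
        }

        // Drop the safe list so earlier MarkAsSafe calls no longer apply
        // (assumes MarkAsSafe recreates the array when it is null).
        values.SetValue(null, null);

        // Same sentinel idea as in the question: make the recorded setting
        // differ from None so the next call is not skipped.
        lower.SetValue(null, Enum.ToObject(lower.FieldType, -1));

        AntiXssEncoder.MarkAsSafe(
            LowerCodeCharts.None,
            LowerMidCodeCharts.None,
            MidCodeCharts.None,
            UpperMidCodeCharts.None,
            UpperCodeCharts.None);
    }
}
```

    Calling AntiXssReset.ResetToAllNone() before each change of encoding would stand in for the missing reset, under the assumptions above.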