
ASP.NET 4.5 Webforms - HTML Encoding gone wrong


I have trouble with an ASP.NET 4.5 webforms application and the new <%#: syntax to do automatic HTML encoding.

I have a formview like this:

<asp:FormView runat="server" ID="FormViewCompany" ItemType="FormViewEncoding.Model.Company" DefaultMode="ReadOnly"
    SelectMethod="Select">
    <ItemTemplate>
        <em>Company</em>
        <br/><br/>
        <asp:Label runat="server" ID="lblName" Text="Name" Width="200px"/>
        <asp:TextBox runat="server" ID="tbxName" Text="<%#: Item.Name %>" Width="600px" />
        <br/>
        <asp:Label runat="server" ID="lblAddress" Text="Address" Width="200px"/>
        <asp:TextBox runat="server" ID="tbxAddress" Text="<%#: Item.Address %>" Width="600px" />
        <br/>
        <asp:Label runat="server" ID="lblZipAndCity" Text="Zip/City" Width="200px"/>
        <asp:TextBox runat="server" ID="tbxZip" Text="<%#: Item.ZipCode %>" Width="75px" />
        &nbsp;
        <asp:TextBox runat="server" ID="tbxCity" Text="<%#: Item.City %>" Width="525px"/>
        <br/>
    </ItemTemplate>    
</asp:FormView>

and when I load a company with German umlauts in its name and address from my SQL Server database (simplified to just return a new instance of an object here):

public class Company
{
    public string Name { get; set; }
    public string Address { get; set; }
    public string ZipCode { get; set; }
    public string City { get; set; }
}

public Company Select()
{
    return new Company
    {
        Name = "Müller & Söhne AG",
        Address = "Hauptstrasse 14",
        ZipCode = "5600",
        City = "Münchenstein"
    };
}

I get this output which is NOT good!


WHY are the German umlauts like ü encoded into &#252;?

This makes no sense to me. I was hoping to use the new <%#: data-binding syntax for automatic HTML encoding, to avoid inadvertently rendering any malicious code. But if all German umlauts get "mangled" in the process too, I absolutely cannot use that feature, which is really a pity!


Solution

  • I believe it is because the text is being encoded twice.

    Take the following line as an example:

    <asp:TextBox runat="server" ID="tbxCity" Text="<%#: Item.City %>" Width="525px"/>
    

    This will effectively become:

    <asp:TextBox runat="server" ID="tbxCity" Text="M&#252;nchenstein" Width="525px"/>
    

    This is the encoding performed by the <%#: %> syntax.

    The TextBox control itself also HTML-encodes its Text value when it renders, so it encodes the already-encoded "M&#252;nchenstein" a second time and outputs "M&amp;#252;nchenstein".

    When M&amp;#252;nchenstein is rendered by the browser you will see M&#252;nchenstein.
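
The double encoding described above can be reproduced outside the page. The following is only a minimal sketch: it assumes that System.Net.WebUtility.HtmlEncode treats this input the same way as the encoder behind <%#: %> and the TextBox, and the class and method names (DoubleEncodingDemo, EncodeOnce, EncodeTwice) are made up for illustration.

```csharp
using System;
using System.Net;

// Hypothetical helper, not part of the original application.
public static class DoubleEncodingDemo
{
    // One encoding pass, standing in for the <%#: %> binding expression.
    public static string EncodeOnce(string raw)
    {
        return WebUtility.HtmlEncode(raw);
    }

    // A second pass over the already-encoded text, standing in for the
    // encoding the TextBox applies to its Text value when it renders.
    public static string EncodeTwice(string raw)
    {
        return WebUtility.HtmlEncode(EncodeOnce(raw));
    }

    public static void Main()
    {
        const string city = "Münchenstein";

        Console.WriteLine(EncodeOnce(city));   // M&#252;nchenstein
        Console.WriteLine(EncodeTwice(city));  // M&amp;#252;nchenstein

        // The browser decodes the markup exactly once, so the entity
        // text is what ends up visible in the text box.
        Console.WriteLine(WebUtility.HtmlDecode(EncodeTwice(city)));  // M&#252;nchenstein
    }
}
```

If this matches what the page does, the second pass is the one that turns the "&" of the entity into "&amp;", which is why the browser shows the entity text instead of the umlaut.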