c# html special-characters

Output HTML with special characters (μ) in C#


I am using C# to generate HTML tables with the special character (μ) via the following code:

string text1 = "100 μs";
string text2 = "100 \u00B5s";
string text3 = "100 &#xb5s";

string[] lines = { "<html>", "<body>", text1, text2, text3, "</body>", "</html>" };

string curFile = @"C:\SpecialCharacter.html";

System.IO.File.WriteAllLines(curFile, lines);

And the output HTML's source looks like the following:

<html>
<body>
100 μs
100 µs
100 &#xb5s
</body>
</html>

with the last one actually displaying correctly (100 µs) in a browser. Is there a better way to do this? Specifically, can the HTML source itself contain "100 µs" and still display correctly in a browser? I know that is possible when writing HTML by hand, but I am not sure how to achieve it from C#.

Thanks.


Solution

  • By default, File.WriteAllLines encodes text as UTF-8 but omits the Unicode byte order mark (BOM). Without a BOM or any other encoding declaration, some browsers guess the wrong encoding and garble non-ASCII characters such as µ. One option is to write the BOM by explicitly passing Encoding.UTF8 (from the System.Text namespace) as the encoding:

    System.IO.File.WriteAllLines(curFile, lines, System.Text.Encoding.UTF8);
    

    Another option is to include a <meta> tag that specifies UTF-8 as the encoding:

    string head = "<head><meta http-equiv='Content-Type' content='text/html; charset=UTF-8'></head>";
    // ...
    string[] lines = { "<html>", head, "<body>", text1, text2, text3, "</body>", "</html>" };
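
    The two options can also be combined. Below is a minimal sketch, not a definitive implementation: the names HtmlWriter, WritePage and HasUtf8Bom are hypothetical helpers made up for illustration, and the file is written to the temp directory rather than C:\ only to avoid permission problems. It writes the page with both the BOM and the <meta> tag, then checks that the file really starts with the bytes EF BB BF.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class HtmlWriter
{
    // Writes a minimal HTML page as UTF-8 with a BOM and a charset <meta> tag.
    public static void WritePage(string path, params string[] bodyLines)
    {
        var lines = new List<string>
        {
            "<html>",
            "<head><meta http-equiv='Content-Type' content='text/html; charset=UTF-8'></head>",
            "<body>"
        };
        lines.AddRange(bodyLines);
        lines.Add("</body>");
        lines.Add("</html>");

        // Encoding.UTF8 emits the byte order mark; the overload without
        // an encoding argument writes UTF-8 without one.
        File.WriteAllLines(path, lines, Encoding.UTF8);
    }

    // Returns true when the file starts with the UTF-8 BOM (EF BB BF).
    public static bool HasUtf8Bom(string path)
    {
        byte[] bytes = File.ReadAllBytes(path);
        return bytes.Length >= 3
            && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;
    }

    public static void Main()
    {
        // Assumed location: the temp directory instead of C:\.
        string path = Path.Combine(Path.GetTempPath(), "SpecialCharacter.html");

        WritePage(path, "100 \u00B5s");

        Console.WriteLine(HasUtf8Bom(path)); // prints True
        Console.WriteLine(File.ReadAllText(path, Encoding.UTF8).Contains("100 \u00B5s")); // prints True
    }
}
```

    With either declaration in place, the literal character can be written straight into the file, so the HTML source shows "100 µs" and no character reference is needed.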