.net xml encoding xmlwriter

XmlWriter encoding issues


I have the following code:

    MemoryStream ms = new MemoryStream();
    XmlWriter w = XmlWriter.Create(ms);

    w.WriteStartDocument(true);
    w.WriteStartElement("data");

    w.WriteElementString("child", "myvalue");

    w.WriteEndElement();//data
    w.Close();
    ms.Close();

    string test = UTF8Encoding.UTF8.GetString(ms.ToArray());

The XML is generated correctly; however, the first character of the string 'test' is ï (char #239), which makes it invalid for some XML parsers. Where is this coming from, and what exactly am I doing wrong?

I know I can resolve the issue by just starting after the first character, but I'd rather know why it's there than simply patching over the problem.

Thanks!


Solution

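  • Found one solution here: https://timvw.be/2007/01/08/generating-utf-8-with-systemxmlxmlwriter/

    Char #239 (0xEF) is the first byte of the UTF-8 byte order mark (EF BB BF), which XmlWriter emits when it writes to a stream with the default Encoding.UTF8. Passing new UTF8Encoding(false) gives it a UTF-8 encoding that omits the mark. I was missing the settings below, which replace the first two lines of my code: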

    XmlWriterSettings xmlWriterSettings = new XmlWriterSettings();
    xmlWriterSettings.Encoding = new UTF8Encoding(false);
    MemoryStream ms = new MemoryStream();
    XmlWriter w = XmlWriter.Create(ms, xmlWriterSettings);
    

    Thanks for the help everyone!
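
For anyone who wants to confirm the cause rather than take it on faith, here is a minimal sketch that writes the same document twice, once with Encoding.UTF8 and once with new UTF8Encoding(false), and checks the output for the UTF-8 preamble. The class name BomDemo and the helpers WriteXml and StartsWithUtf8Bom are made up for this illustration; they are not part of the original code or of any library.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class BomDemo
{
    // Hypothetical helper: writes the question's small document
    // to a MemoryStream using the given encoding and returns the raw bytes.
    public static byte[] WriteXml(Encoding encoding)
    {
        var settings = new XmlWriterSettings { Encoding = encoding };
        using (var ms = new MemoryStream())
        {
            // Disposing the writer flushes it; the stream stays usable
            // because XmlWriterSettings.CloseOutput defaults to false.
            using (XmlWriter w = XmlWriter.Create(ms, settings))
            {
                w.WriteStartDocument(true);
                w.WriteStartElement("data");
                w.WriteElementString("child", "myvalue");
                w.WriteEndElement(); // data
            }
            return ms.ToArray();
        }
    }

    // Hypothetical helper: true if the bytes begin with the UTF-8
    // preamble (EF BB BF) reported by Encoding.UTF8.GetPreamble().
    public static bool StartsWithUtf8Bom(byte[] bytes)
    {
        byte[] bom = Encoding.UTF8.GetPreamble();
        if (bytes.Length < bom.Length) return false;
        for (int i = 0; i < bom.Length; i++)
        {
            if (bytes[i] != bom[i]) return false;
        }
        return true;
    }

    public static void Main()
    {
        byte[] withBom = WriteXml(Encoding.UTF8);
        byte[] withoutBom = WriteXml(new UTF8Encoding(false));

        Console.WriteLine(StartsWithUtf8Bom(withBom));
        Console.WriteLine(StartsWithUtf8Bom(withoutBom));

        // Decoding the BOM-free bytes yields a string that starts at "<?xml".
        Console.WriteLine(Encoding.UTF8.GetString(withoutBom));
    }
}
```

The first check should report a BOM and the second should not. If the XML only needs to end up in a string, decoding the BOM-free bytes as above avoids having to strip the first character by hand.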