
Load xml from string that contains HTML


I am doing the following -

        // Create message
        StringBuilder sbXML = new StringBuilder();
        sbXML.Append("<root>");
        sbXML.AppendFormat("<messageBody>{0}</messageBody>", JsonString);
        sbXML.Append("</root>");

Here JsonString is a JSON string, but some of its entries are strings of HTML (which I think is why it is breaking).

When I do -

        XmlDocument xmlDOC = new XmlDocument();
        xmlDOC.LoadXml(sbXML.ToString());

I get the error -

'\' is an unexpected token. The expected token is '"' or '''.

My JSON also contains URLs, for instance -

{
    "exampleJson": {
        "url":  "http://example.com/",
        "html": "<a href=\"http://example.com\" rel=\"test\">example text</a>"
    }
}

I believe these values are causing the exception. Is there a way around this, so that xmlDOC.LoadXml can load my JSON? I considered doing something like -

xmlDOC.LoadXml(sbXML.ToString().Replace("character to replace", "acceptable character"));

However, this is obviously not ideal. I also tried just using .Load, but this resulted in an "illegal characters in the path" exception.


Solution

  • I think you want to be doing something like:

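            // Build only the fixed skeleton as a string; it contains no user data.
            StringBuilder sbXML = new StringBuilder();
            sbXML.Append("<root>");
            sbXML.Append("<messageBody />");
            sbXML.Append("</root>");

            XmlDocument xmlDOC = new XmlDocument();
            xmlDOC.LoadXml(sbXML.ToString());
            // Assigning InnerText lets XmlDocument escape <, > and & in JsonString.
            xmlDOC.DocumentElement.SelectSingleNode("messageBody").InnerText = JsonString;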
    

    As pointed out by @Alexei Levenkov, creating XML by string concatenation is a really bad idea and will lead to more problems later.

    Using the System.Xml.XmlDocument methods is much safer: they encode whatever is needed to make the value of JsonString XML-safe.
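
To make the difference concrete, here is a minimal sketch that contrasts the two approaches on the sample JSON from the question. It goes one step further than the answer and builds the whole document through the DOM API, so no XML is written by hand at all. The helper name BuildMessage and the class name Program are made up for this illustration; the root and messageBody element names come from the question.

```csharp
using System;
using System.Xml;

public static class Program
{
    // Hypothetical helper (not from the original post): wraps an arbitrary
    // string in <root><messageBody>...</messageBody></root> using the DOM API.
    public static XmlDocument BuildMessage(string jsonString)
    {
        var doc = new XmlDocument();
        XmlElement root = doc.CreateElement("root");
        XmlElement body = doc.CreateElement("messageBody");

        // InnerText stores the value as a text node; markup characters
        // are escaped when the document is serialized.
        body.InnerText = jsonString;

        root.AppendChild(body);
        doc.AppendChild(root);
        return doc;
    }

    public static void Main()
    {
        // Same shape as the question's JSON: an HTML fragment with
        // backslash-escaped quotes inside a JSON string value.
        string json = "{\"html\": \"<a href=\\\"http://example.com\\\" rel=\\\"test\\\">example text</a>\"}";

        // String concatenation: the parser treats <a href=\"... as a real
        // element and rejects the backslash where a quote is required.
        try
        {
            new XmlDocument().LoadXml("<root><messageBody>" + json + "</messageBody></root>");
        }
        catch (XmlException ex)
        {
            Console.WriteLine("Concatenation failed: " + ex.Message);
        }

        // DOM approach: the JSON is carried as escaped text.
        XmlDocument doc = BuildMessage(json);
        Console.WriteLine(doc.OuterXml);

        // Reload the serialized XML and compare the text with the input.
        var reloaded = new XmlDocument();
        reloaded.LoadXml(doc.OuterXml);
        string roundTripped = reloaded.DocumentElement.SelectSingleNode("messageBody").InnerText;
        Console.WriteLine(roundTripped == json);
    }
}
```

Whoever reads the message gets the original JSON back from InnerText and can hand it to a JSON parser unchanged, so no Replace calls are needed on either side.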