
How to create C# class hierarchy for XML with namespace?


I need to deserialize part of an XML document into a model. The part in question looks more or less like the following:

<Root xmlns:c="Ala ma kota">

<!-- ... -->

        <c:Histogram>
            <Data Width="10" Height="20" />
        </c:Histogram>

<!-- ... -->

</Root>

Deserialization is done by capturing the specific XmlNode from the XmlDocument (in this case: c:Histogram) and then using its XmlNode.OuterXml as the source for the deserializer (I failed to find a way to deserialize an object directly from an XmlNode).

I have designed the classes for this XML as follows:


[XmlRoot("Histogram", Namespace = "Ala ma kota")]
public class Histogram 
{
    [XmlElement]
    public Data Data { get; set; }
}

[XmlRoot]
public class Data
{
    [XmlAttribute]
    public int Width { get; set; }

    [XmlAttribute]
    public int Height { get; set; }
}

The deserialization runs without any exceptions, but the deserialized object is missing Data (the property is null).

I hooked a handler to the XmlSerializer.UnknownNode event and indeed, <Data ... /> is reported as an "Unknown node".

The only way I managed to get the deserialization to work was to add the namespace to the Data class:
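
A minimal sketch of the setup described above, assuming the fragment is located with a local-name() XPath query and deserialized from its OuterXml. The names UnknownNodeDemo and CollectUnknownNodes are hypothetical and exist only for illustration; the model classes are the ones from the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

[XmlRoot("Histogram", Namespace = "Ala ma kota")]
public class Histogram
{
    [XmlElement]
    public Data Data { get; set; }
}

[XmlRoot]
public class Data
{
    [XmlAttribute]
    public int Width { get; set; }

    [XmlAttribute]
    public int Height { get; set; }
}

// Hypothetical helper, not part of the original code.
public static class UnknownNodeDemo
{
    // Deserializes the OuterXml of the first element whose local name is
    // "Histogram" and returns a description of every node the serializer
    // reported through its UnknownNode event.
    public static List<string> CollectUnknownNodes(string xml, out Histogram result)
    {
        var document = new XmlDocument();
        document.LoadXml(xml);
        var node = document.SelectSingleNode("//*[local-name() = 'Histogram']");

        var unknown = new List<string>();
        var serializer = new XmlSerializer(typeof(Histogram));
        serializer.UnknownNode += (sender, e) =>
            unknown.Add($"{e.NodeType} '{e.Name}' in namespace '{e.NamespaceURI}'");

        using (var reader = new StringReader(node.OuterXml))
            result = (Histogram)serializer.Deserialize(reader);

        return unknown;
    }
}
```

Calling CollectUnknownNodes with the XML shown earlier and printing the returned list is one way to see which namespace the serializer found on the rejected element.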

[XmlRoot("Data", Namespace = "Ala ma kota")]
public class Data
// (...)

And also the XML (note the c:Data instead of Data):

<Root xmlns:c="Ala ma kota">

<!-- ... -->

        <c:Histogram>
            <c:Data Width="10" Height="20" />
        </c:Histogram>

<!-- ... -->

</Root>

In this case the deserialization works like a charm.


My questions:

First: what is the default namespace if the root node is defined with a named namespace? (I mean with prefix instead of simply xmlns="...") Can I explicitly define the default namespace, so that the inner model classes would be deserialized correctly?

<Root>      <!-- Default namespace is empty -->
  <Data />    <!-- Equal to default namespace -->
</Root>

<c:Root xmlns:c="Ala ma kota">  <!-- Namespace is "Ala ma kota" -->
    <Data />                    <!-- Namespace is ? -->
</c:Root>

Second: Is there a way to redefine model classes to avoid adding Namespace = and c: everywhere in C# and XML, respectively? I will have multiple model classes and adding those will be cumbersome. I'd like to avoid it if possible.

Third: Is there a way to strip the namespace from the "inner-root" element so that the deserialization would work without problems?


So far I have worked around the problem by adding an artificial child element:

<Root xmlns:c="Ala ma kota">

<!-- ... -->

        <c:Histogram>
            <Config>
                <Data Width="10" Height="20" /> 
            </Config>
        </c:Histogram>

<!-- ... -->

</Root>

Then instead of deserializing the "inner-root" element, I'm deserializing its first (and only) child.

I'm curious though if this problem can be solved without such workarounds.


Solution

  • Answering your three questions:

    1. What is the default namespace if the root node is defined with a named namespace? As explained in Namespaces in XML 1.0 (Third Edition): 6.2 Namespace Defaulting:

      If there is no default namespace declaration in scope, the namespace name has no value.

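      Microsoft's XML implementations all correctly respect this rule, so an unprefixed <Data /> element under <c:Root xmlns:c="Ala ma kota"> is in no namespace (the empty namespace).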

    2. Is there a way to redefine model classes to avoid adding Namespace = and c: everywhere in C# and XML, respectively?

      When you apply XmlRootAttribute.Namespace to your root data model, XmlSerializer interprets that namespace as applying recursively to all the nested objects in the serialization graph, unless you override that default behavior by applying some other XML serialization attribute. Ways to do that include:

      • Apply [XmlType(Namespace = "")] to the root data model. This indicates that all child elements of the root element are in the empty namespace.

      • Apply [XmlElement(Namespace = "")] to Histogram.Data. This indicates that the Data element specifically is in the empty namespace.

      The first option is likely better for you as you explicitly stated you wanted to avoid sprinkling Namespace= everywhere.

    3. To deserialize directly from an XmlElement, you may create an XmlNodeReader and pass it to XmlSerializer.Deserialize(XmlReader). However, there is a known bug with XmlNodeReader and namespaces, so you may need to use ProperXmlNodeReader from this answer by Nathan Baulch to Deserialize object property with StringReader vs XmlNodeReader to work around the issue. With the simple example shown in your question it does not seem necessary, but with your real XML it may be, so you will have to experiment and see.
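
    As a quick check of point 1, the following sketch parses a document with LINQ to XML and reports the namespace of the first <Data> element it finds. The names NamespaceDefaultingDemo and NamespaceOfData are hypothetical and exist only for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the original answer.
public static class NamespaceDefaultingDemo
{
    // Returns the namespace name of the first element whose local name is
    // "Data", regardless of which namespace that element is in.
    public static string NamespaceOfData(string xml)
    {
        var doc = XDocument.Parse(xml);
        var data = doc.Descendants().First(e => e.Name.LocalName == "Data");
        return data.Name.NamespaceName;
    }
}
```

    With <c:Root xmlns:c="Ala ma kota"><Data /></c:Root> the method returns an empty string, because a prefixed declaration does not set a default namespace. With <Root xmlns="Ala ma kota"><Data /></Root> it returns "Ala ma kota".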

    Putting all that together, if you modify your data model as follows:

    [XmlRoot("Histogram", Namespace = "Ala ma kota")]
    [XmlType(Namespace = "")]
    public class Histogram 
    {
        public Data Data { get; set; }
    }
    

    And introduce the following extension methods:

    public static partial class XmlSerializationExtensions
    {
        public static T? Deserialize<T>(this XmlElement element, XmlSerializer? serializer = null)
        {
            using (var reader = new ProperXmlNodeReader(element))
                return (T?)(serializer ?? new XmlSerializer(typeof(T))).Deserialize(reader);
        }
    }
    
    public class ProperXmlNodeReader : XmlNodeReader
    {
        // Bug fix from this answer https://stackoverflow.com/a/30115691/3744182
        // By https://stackoverflow.com/users/8799/nathan-baulch
        // To https://stackoverflow.com/questions/30102275/deserialize-object-property-with-stringreader-vs-xmlnodereader
        // You may need to test whether this is still necessary.
        public ProperXmlNodeReader(XmlNode node) : base(node) {}
        public override string? LookupNamespace(string prefix) => base.LookupNamespace(prefix) is {} ns ? NameTable.Add(ns) : null;
    }
    

    You can select and deserialize <c:Histogram> as follows:

    var element = (XmlElement?)xmlDocument.SelectSingleNode(@".//*[local-name() = 'Histogram']");
    var histogram = element?.Deserialize<Histogram>();
    

    Demo fiddle here.
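
    For comparison, here is a minimal sketch of the second option from point 2, where [XmlElement(Namespace = "")] is applied to Histogram.Data instead of [XmlType(Namespace = "")] to the class. It assumes the stock XmlNodeReader is sufficient for the simple XML from the question; HistogramLoader and Load are hypothetical names used only for illustration.

```csharp
using System;
using System.Xml;
using System.Xml.Serialization;

[XmlRoot("Histogram", Namespace = "Ala ma kota")]
public class Histogram
{
    // Places this one element in the empty namespace instead of the
    // namespace specified on the root.
    [XmlElement(Namespace = "")]
    public Data Data { get; set; }
}

public class Data
{
    [XmlAttribute]
    public int Width { get; set; }

    [XmlAttribute]
    public int Height { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class HistogramLoader
{
    // Finds the first element whose local name is "Histogram" and
    // deserializes it directly from the DOM node, without going through
    // OuterXml. Returns null when no such element exists.
    public static Histogram Load(string xml)
    {
        var document = new XmlDocument();
        document.LoadXml(xml);

        var element = (XmlElement)document.SelectSingleNode("//*[local-name() = 'Histogram']");
        if (element == null)
            return null;

        var serializer = new XmlSerializer(typeof(Histogram));
        using (var reader = new XmlNodeReader(element))
            return (Histogram)serializer.Deserialize(reader);
    }
}
```

    This variant has to be repeated on every property whose element lives outside the root namespace, which is why the class-level [XmlType(Namespace = "")] scales better when there are many model classes.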

    That being said, LINQ-to-XML is currently preferred to the old XmlDocument API, so you might consider switching to that API. If you do, the equivalent deserialization extension method would look like:

    public static partial class XmlSerializationExtensions
    {
        public static T? Deserialize<T>(this XContainer element, XmlSerializer? serializer = null)
        {
            using (var reader = element.CreateReader())
                return (T?)(serializer ?? new XmlSerializer(typeof(T))).Deserialize(reader);
        }
    }
    

    And your application code would look like:

    // Load the XDocument somehow
    //var doc = XDocument.Load(fileName);        // Load from a file.
    var doc = XDocument.Parse(outerXmlString); // Or parse from some already loaded string
    
    XNamespace ns = "Ala ma kota";
    var element = doc.Descendants(ns + "Histogram").FirstOrDefault();
    var histogram = element?.Deserialize<Histogram>();
    

    The reader returned by XNode.CreateReader() does not have a corresponding bug with LookupNamespace so there is no need to introduce a fixed version.

    Demo fiddle #2 here.