Tags: c#, xml, xml-serialization, xmlserializer

Serialize text between XML nodes


I have this XML:

<rootnode>
    Some text <node1>A Name</node1> some more text <node2>A value</node2>
</rootnode>

The content is optional: there can be text before, between, or after the elements, and node1 and node2 need not be present.

I'd like to deserialize this XML into the following C# class:

public class RootNode
{
    public String[] Text;
    public Node1Type Node1;
    public Node2Type Node2;
}

Node1 and Node2 can be more complex elements. The Text member should contain the interleaved text parts.

I've tried using this annotated class:

[XmlRoot( ElementName = "rootnode" )]
public class RootNode
{
    [XmlText]
    public String Text;

    [XmlElement( ElementName = "node1" )]
    public Node1Type Node1;

    [XmlElement( ElementName = "node2" )]
    public Node2Type Node2;
}

However, it only captures the text at the beginning.

I deserialize using this:

public static T ParseXml<T>( String value ) where T : class {
    var xmlSerializer = new XmlSerializer( typeof( T ) );
    using( var textReader = new StringReader( value ) )
        return (T)xmlSerializer.Deserialize( textReader );
}

How do I capture the whole text?


Solution

  • According to the documentation for XmlTextAttribute:

    You can apply the XmlTextAttribute to a field or property that returns an array of strings.

    And, in fact, doing so will capture the complete string content of <rootnode>:

    [XmlRoot( ElementName = "rootnode" )]
    public class RootNode
    {
        [XmlText]
        public string[] Text;
    
        [XmlElement( ElementName = "node1" )]
        public Node1Type Node1;
    
        [XmlElement( ElementName = "node2" )]
        public Node2Type Node2;
    }
    

    Working fiddle #1 shows that the XML is deserialized and re-serialized as:

    <rootnode>
        Some text  some more text <node1>A Name</node1><node2>A value</node2></rootnode>
    

    Note, however, that the interleaving of text nodes and <nodeX> elements is not preserved. If that order matters, you will need a polymorphic array of objects, as suggested in Stefan's answer to Correct XML serialization and deserialization of "mixed" types in .NET:

    [XmlRoot( ElementName = "rootnode" )]
    public class RootNode
    {
        [XmlText(typeof(string))]
        [XmlElement( ElementName = "node1", Type = typeof(Node1Type) )]
        [XmlElement( ElementName = "node2", Type = typeof(Node2Type) )]
        public object[] nodes;
    }
    

    You must mark the array with one XML serialization attribute for every type that can occur in it: [XmlText(typeof(string))] for the text parts and one [XmlElement] per element type.

    Working fiddle #2 shows that the XML is deserialized and re-serialized as:

    <rootnode>
        Some text <node1>A Name</node1> some more text <node2>A value</node2></rootnode>
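
To compare the two approaches side by side, here is a minimal sketch that deserializes the same input with both class shapes. Several names are assumptions made for illustration and do not appear in the original post: the Node1Type and Node2Type definitions (reduced here to a single text value), the class names FlatRootNode and OrderedRootNode, and the Describe helper. The ParseXml helper is the one from the question.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical element types; the original post does not show their definitions.
public class Node1Type { [XmlText] public string Value; }
public class Node2Type { [XmlText] public string Value; }

// Variant 1: text parts are collected apart from the elements,
// so their position relative to node1/node2 is not recorded.
[XmlRoot(ElementName = "rootnode")]
public class FlatRootNode
{
    [XmlText]
    public string[] Text;

    [XmlElement(ElementName = "node1")]
    public Node1Type Node1;

    [XmlElement(ElementName = "node2")]
    public Node2Type Node2;
}

// Variant 2: text and elements share one array, in document order.
[XmlRoot(ElementName = "rootnode")]
public class OrderedRootNode
{
    [XmlText(typeof(string))]
    [XmlElement(ElementName = "node1", Type = typeof(Node1Type))]
    [XmlElement(ElementName = "node2", Type = typeof(Node2Type))]
    public object[] Nodes;
}

public static class MixedContentDemo
{
    public const string Xml =
        "<rootnode>Some text <node1>A Name</node1> some more text <node2>A value</node2></rootnode>";

    // Same helper as in the question.
    public static T ParseXml<T>(string value) where T : class
    {
        var xmlSerializer = new XmlSerializer(typeof(T));
        using (var textReader = new StringReader(value))
            return (T)xmlSerializer.Deserialize(textReader);
    }

    // Hypothetical helper: labels one entry of the mixed array by its runtime type.
    public static string Describe(object item)
    {
        switch (item)
        {
            case string text: return "text:  [" + text + "]";
            case Node1Type n1: return "node1: " + n1.Value;
            case Node2Type n2: return "node2: " + n2.Value;
            default: return "unexpected: " + item;
        }
    }

    public static void Main()
    {
        var flat = ParseXml<FlatRootNode>(Xml);
        Console.WriteLine("Variant 1, text parts only:");
        foreach (string part in flat.Text)
            Console.WriteLine("  [" + part + "]");

        var ordered = ParseXml<OrderedRootNode>(Xml);
        Console.WriteLine("Variant 2, entries in document order:");
        foreach (object item in ordered.Nodes)
            Console.WriteLine("  " + Describe(item));
    }
}
```

With the second variant, consumers have to check each entry's runtime type, which is the price of keeping the order. If only the text itself is needed, the string[] variant is simpler to work with.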