I recently registered event handlers for unknown nodes, elements and attributes with the XmlSerializer I use to deserialize complex types from a type hierarchy. I did this because some of the XML I receive is from third parties; I am interested in data format changes which could cause trouble on my end.
In the XML it produces, the XmlSerializer uses the standard XML attribute xsi:type="somederivedtypename" to identify the actual derived type represented by an XML element.
I was surprised to see that the same serializer treats that very same attribute it just produced as unknown upon deserialization. Interestingly though, the deserialization is correct and complete (also with more complicated types and data in my real-world program). That means that the serializer evaluates the type information properly during an early stage in the deserialization. But during a later data-extraction stage the attribute is apparently mistaken for a true data part of the object, which is of course unknown.
In my application the gratuitous warnings end up cluttering a general-purpose log file, which is undesirable. In my opinion the serializer should read back the XML it produced without hiccups. My questions: Am I doing something wrong? Is there a workaround?
A minimal example is here:
using System;
using System.IO;
using System.Xml.Serialization;
namespace XsiTypeAnomaly
{
    /// <summary>
    /// A trivial base type.
    /// </summary>
    [XmlInclude(typeof(DerivedT))]
    public class BaseT{}
    /// <summary>
    /// A trivial derived type to demonstrate a serialization issue.
    /// </summary>
    public class DerivedT : BaseT
    {
        public int anInt { get; set; }
    }
    class Program
    {
        private static void serializer_UnknownAttribute
            ( object sender,
              XmlAttributeEventArgs e )
        {
            Console.Error.WriteLine("Warning: Deserializing "
                + e.ObjectBeingDeserialized
                + ": Unknown attribute "
                + e.Attr.Name);
        }
        private static void serializer_UnknownNode(object sender, XmlNodeEventArgs e)
        {
            Console.Error.WriteLine("Warning: Deserializing "
                + e.ObjectBeingDeserialized
                + ": Unknown node "
                + e.Name);
        }
        private static void serializer_UnknownElement(object sender, XmlElementEventArgs e)
        {
            Console.Error.WriteLine("Warning: Deserializing "
                + e.ObjectBeingDeserialized
                + ": Unknown element "
                + e.Element.Name);
        }
        /// <summary>
        /// Serialize, display the xml, and deserialize a trivial object.
        /// </summary>
        /// <param name="args"></param>
        static void Main(string[] args)
        {
            BaseT aTypeObj = new DerivedT() { anInt = 1 };
            using (MemoryStream stream = new MemoryStream())
            {
                var serializer = new XmlSerializer(typeof(BaseT));
                // register event handlers for unknown XML bits
                serializer.UnknownAttribute += serializer_UnknownAttribute;
                serializer.UnknownElement += serializer_UnknownElement;
                serializer.UnknownNode += serializer_UnknownNode;
                serializer.Serialize(stream, aTypeObj);
                stream.Flush();
                // output the xml
                stream.Position = 0;
                Console.Write((new StreamReader(stream)).ReadToEnd() + Environment.NewLine);
                stream.Position = 0;
                var serResult = serializer.Deserialize(stream) as DerivedT;
                Console.WriteLine(
                    (serResult.anInt == 1 ? "Successfully " : "Unsuccessfully ")
                    + "read back object");
            }
        }
    }
}
Output:
<?xml version="1.0"?>
<BaseT xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xsi:type="DerivedT">
<anInt>1</anInt>
</BaseT>
Warning: Deserializing XsiTypeAnomaly.DerivedT: Unknown node xsi:type
Warning: Deserializing XsiTypeAnomaly.DerivedT: Unknown attribute xsi:type
Successfully read back object
Am I doing something wrong?
I don't think so. I share your opinion that XmlSerializer ought to deserialize its own output without any warnings. Also, xsi:type is a standard attribute defined in the XML Schema specification, and obviously it is supported by XmlSerializer, as demonstrated by your example and documented in MSDN Library.
Therefore, this behavior simply looks like an oversight. I can imagine a group of Microsoft developers working on different aspects of XmlSerializer during the development of the .NET Framework, and never testing xsi:type and events at the same time.
That means that the serializer evaluates the type information properly during an early stage in the deserialization. But during a later data-extraction stage the attribute is apparently mistaken for a true data part of the object, which is of course unknown.
Your observation is correct.
The XmlSerializer class generates a dynamic assembly to serialize and deserialize objects. In your example, the generated method that deserializes instances of DerivedT looks something like this:
private DerivedT Read2_DerivedT(bool isNullable, bool checkType)
{
    // [Code that uses isNullable and checkType omitted...]
    DerivedT derivedT = new DerivedT();
    while (this.Reader.MoveToNextAttribute())
    {
        if (!this.IsXmlnsAttribute(this.Reader.Name))
            this.UnknownNode(derivedT);
    }
    this.Reader.MoveToElement();
    // [Code that reads child elements and populates derivedT.anInt omitted...]
    return derivedT;
}
The deserializer calls this method after it reads the xsi:type attribute and decides to create an instance of DerivedT. As you can see, the while loop raises the UnknownNode event for all attributes except xmlns attributes. That's why you get the UnknownNode (and UnknownAttribute) event for xsi:type.
The while loop is generated by the internal XmlSerializationReaderILGen.WriteAttributes method. The code is rather complicated, but I see no code path that would cause xsi:type attributes to be skipped (other than the second workaround I describe below).
Is there a workaround?
I would just ignore UnknownNode and UnknownAttribute events for xsi:type:
// Requires: using System.Xml; (XmlNodeType) and using System.Xml.Schema; (XmlSchema)
private static void serializer_UnknownNode(object sender, XmlNodeEventArgs e)
{
    if (e.NodeType == XmlNodeType.Attribute &&
        e.NamespaceURI == XmlSchema.InstanceNamespace && e.LocalName == "type")
    {
        // Ignore xsi:type attributes.
    }
    else
    {
        // [Log it...]
    }
}
// [And similarly for UnknownAttribute using e.Attr instead of e...]
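To spell out the UnknownAttribute counterpart mentioned in the last comment, here is a minimal sketch of both handlers sharing one check. The class name XsiTypeFilter and the helper IsXsiType are hypothetical names chosen for illustration; the logging lines simply mirror the ones from the question.

```csharp
using System;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical helper class; the names are illustrative only.
public static class XsiTypeFilter
{
    // True when the namespace URI / local name pair identifies xsi:type.
    public static bool IsXsiType(string namespaceUri, string localName)
    {
        return namespaceUri == XmlSchema.InstanceNamespace && localName == "type";
    }

    public static void OnUnknownNode(object sender, XmlNodeEventArgs e)
    {
        if (e.NodeType == XmlNodeType.Attribute && IsXsiType(e.NamespaceURI, e.LocalName))
            return; // Ignore xsi:type attributes.

        Console.Error.WriteLine("Warning: Deserializing "
            + e.ObjectBeingDeserialized + ": Unknown node " + e.Name);
    }

    public static void OnUnknownAttribute(object sender, XmlAttributeEventArgs e)
    {
        // Same check as above, but read from the XmlAttribute in e.Attr.
        if (IsXsiType(e.Attr.NamespaceURI, e.Attr.LocalName))
            return; // Ignore xsi:type attributes.

        Console.Error.WriteLine("Warning: Deserializing "
            + e.ObjectBeingDeserialized + ": Unknown attribute " + e.Attr.Name);
    }
}
```

Assuming the serializer from the question, the handlers would be registered with serializer.UnknownNode += XsiTypeFilter.OnUnknownNode; and serializer.UnknownAttribute += XsiTypeFilter.OnUnknownAttribute;. Comparing the namespace URI rather than the "xsi" prefix keeps the filter working when a third party binds the schema-instance namespace to a different prefix.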
Another (hackier, IMO) workaround is to map xsi:type to a dummy property in the BaseT class:
// Requires: using System.Diagnostics; using System.Xml.Schema; using System.Xml.Serialization;
[XmlInclude(typeof(DerivedT))]
public class BaseT
{
    [XmlAttribute("type", Namespace = XmlSchema.InstanceNamespace)]
    [DebuggerBrowsable(DebuggerBrowsableState.Never)] // Hide this useless property
    public string XmlSchemaType
    {
        get { return null; } // Must return null for XmlSerializer.Serialize to work
        set { }
    }
}