In one of my C# projects I use a WCF data contract serializer for serialization to XML. The framework however consists of multiple extension modules that may be loaded or not, dependent on some startup configuration (I use MEF in case it matters). In the future the list of modules may potentially grow and I fear that this situation may someday pose problems with module-specific data. As I understand I can implement a data contract resolver to bidirectionally help the serializer locate types, but what happens if the project contains data it cannot interpret because the associated module is not loaded?
I am looking for a solution that allows me to preserve existing serialized data in cases where not the full set of modules is loaded (or even available). I think of this as a way to tell the de-serializer "if you don't understand what you get, then don't try to serialize it, but please keep the data somewhere so that you can put it back when serializing the next time". I think my problem is related to round-tripping, but I wasn't very successful (yet) in finding a hint on how to deal with such a case where complex types may be added or removed between serialization actions.
Minimal example: Suppose I start my application with the optional modules A, B and C and produce the following XML (AData, BData and CData are in a collection and may be all derived from a common base class):
<Project xmlns="http://schemas.datacontract.org/2004/07/TestApplication" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<Data>
<ModuleData i:type="AData">
<A>A</A>
</ModuleData>
<ModuleData i:type="BData">
<B>B</B>
</ModuleData>
<ModuleData i:type="CData">
<C>C</C>
</ModuleData>
</Data>
</Project>
In case I skip module C (containing the definition of CData) and load the same project, then the serializer fails because it has no idea how to deal with CData. If I can somehow manage to convince the framework to keep the data and leave it untouched until someone opens the project again with module C, then I win. Of course I could implement dynamic data structures for storing extension data, e.g., key-value trees, but it would be neat to use the existing serialization framework also in extension modules. Any hint on how to achieve this is highly appreciated!
The example code to produce the above output is as follows:
using System;
using System.IO;
using System.Collections.Generic;
using System.Runtime.Serialization;
namespace TestApplication
{
// common base class
[DataContract]
public class ModuleData : IExtensibleDataObject
{
public virtual ExtensionDataObject ExtensionData { get; set; }
}
[DataContract]
public class AData : ModuleData
{
[DataMember]
public string A { get; set; }
}
[DataContract]
public class BData : ModuleData
{
[DataMember]
public string B { get; set; }
}
[DataContract]
public class CData : ModuleData
{
[DataMember]
public string C { get; set; }
}
[DataContract]
[KnownType(typeof(AData))]
[KnownType(typeof(BData))]
public class Project
{
[DataMember]
public List<ModuleData> Data { get; set; }
}
class Program
{
static void Main(string[] args)
{
// new project object
var project1 = new Project()
{
Data = new List<ModuleData>()
{
new AData() { A = "A" },
new BData() { B = "B" },
new CData() { C = "C" }
}
};
// serialization; make CData explicitly known to simulate presence of "module C"
var stream = new MemoryStream();
var serializer1 = new DataContractSerializer(typeof(Project), new[] { typeof(CData) });
serializer1.WriteObject(stream, project1);
stream.Position = 0;
var reader = new StreamReader(stream);
Console.WriteLine(reader.ReadToEnd());
// deserialization; skip "module C"
stream.Position = 0;
var serializer2 = new DataContractSerializer(typeof(Project));
var project2 = serializer2.ReadObject(stream) as Project;
}
}
}
I also uploaded a VS2015 solution here.
Your problem is that you have a polymorphic known type hierarchy, and you would like to use the round-tripping mechanism of DataContractSerializer
to read and save "unknown" known types, specifically XML elements with an xsi:type
type hint referring to a type not currently loaded into your app domain.
Unfortunately, this use case simply isn't implemented by the round-tripping mechanism. That mechanism is designed to cache unknown data members inside an ExtensionData
object, provided that the data contract object itself can be successfully deserialized and implements IExtensibleDataObject
. Unfortunately, in your situation the data contract object cannot be constructed precisely because the polymorphic subtype is unrecognized; instead the following exception gets thrown:
System.Runtime.Serialization.SerializationException occurred
Message="Error in line 4 position 6. Element 'http://www.Question45412824.com:ModuleData' contains data of the 'http://www.Question45412824.com:CData' data contract. The deserializer has no knowledge of any type that maps to this contract. Add the type corresponding to 'CData' to the list of known types - for example, by using the KnownTypeAttribute attribute or by adding it to the list of known types passed to DataContractSerializer."
Even if I try to create a custom generic collection marked with [CollectionDataContract]
that implements IExtensibleDataObject
to cache items with unrecognized contracts, the same exception gets thrown.
One solution is to take advantage of the fact that your problem is slightly less difficult than the round-tripping problem. You (the software architect) actually know all possible polymorphic subtypes. Your software does not, because it isn't always loading the assemblies that contain them. Thus what you can do is load lightweight dummy types instead of the real types when the real types aren't needed. As long as the dummy types implement IExtensibleDataObject
and have the same data contract namespace and name and the real types, their data contracts will be interchangeable with the "real" data contracts in polymorphic collections.
Thus, if you define your types as follows, adding a Dummies.CData
dummy placeholder:
public static class Namespaces
{
// The data contract namespace for your project.
public const string ProjectNamespace = "http://www.Question45412824.com";
}
// common base class
[DataContract(Namespace = Namespaces.ProjectNamespace)]
public class ModuleData : IExtensibleDataObject
{
public ExtensionDataObject ExtensionData { get; set; }
}
[DataContract(Namespace = Namespaces.ProjectNamespace)]
public class AData : ModuleData
{
[DataMember]
public string A { get; set; }
}
[DataContract(Namespace = Namespaces.ProjectNamespace)]
public class BData : ModuleData
{
[DataMember]
public string B { get; set; }
}
[DataContract(Namespace = Namespaces.ProjectNamespace)]
[KnownType(typeof(AData))]
[KnownType(typeof(BData))]
public class Project
{
[DataMember]
public List<ModuleData> Data { get; set; }
}
[DataContract(Namespace = Namespaces.ProjectNamespace)]
public class CData : ModuleData
{
[DataMember]
public string C { get; set; }
}
namespace Dummies
{
[DataContract(Namespace = Namespaces.ProjectNamespace)]
public class CData : ModuleData
{
}
}
You will be able to deserialize your Project
object using either the "real" CData
or the "dummy" version, as shown with the test below:
class Program
{
static void Main(string[] args)
{
new TestClass().Test();
}
}
class TestClass
{
public virtual void Test()
{
// new project object
var project1 = new Project()
{
Data = new List<ModuleData>()
{
new AData() { A = "A" },
new BData() { B = "B" },
new CData() { C = "C" }
}
};
// serialization; make CData explicitly known to simulate presence of "module C"
var extraTypes = new[] { typeof(CData) };
var extraTypesDummy = new[] { typeof(Dummies.CData) };
var xml = project1.SerializeXml(extraTypes);
ConsoleAndDebug.WriteLine(xml);
// Demonstrate that the XML can be deserialized with the dummy CData type.
TestDeserialize(project1, xml, extraTypesDummy);
// Demonstrate that the XML can be deserialized with the real CData type.
TestDeserialize(project1, xml, extraTypes);
try
{
// Demonstrate that the XML cannot be deserialized without either the dummy or real type.
TestDeserialize(project1, xml, new Type[0]);
Assert.IsTrue(false);
}
catch (AssertionFailedException ex)
{
Console.WriteLine("Caught unexpected exception: ");
Console.WriteLine(ex);
throw;
}
catch (Exception ex)
{
ConsoleAndDebug.WriteLine(string.Format("Caught expected exception: {0}", ex.Message));
}
}
public void TestDeserialize<TProject>(TProject project, string xml, Type[] extraTypes)
{
TestDeserialize<TProject>(xml, extraTypes);
}
public void TestDeserialize<TProject>(string xml, Type[] extraTypes)
{
var project2 = xml.DeserializeXml<TProject>(extraTypes);
var xml2 = project2.SerializeXml(extraTypes);
ConsoleAndDebug.WriteLine(xml2);
// Assert that the incoming and re-serialized XML are equivalent (no data was lost).
Assert.IsTrue(XNode.DeepEquals(XElement.Parse(xml), XElement.Parse(xml2)));
}
}
public static partial class DataContractSerializerHelper
{
public static string SerializeXml<T>(this T obj, Type [] extraTypes)
{
return obj.SerializeXml(new DataContractSerializer(obj == null ? typeof(T) : obj.GetType(), extraTypes));
}
public static string SerializeXml<T>(this T obj, DataContractSerializer serializer)
{
serializer = serializer ?? new DataContractSerializer(obj == null ? typeof(T) : obj.GetType());
using (var textWriter = new StringWriter())
{
var settings = new XmlWriterSettings { Indent = true };
using (var xmlWriter = XmlWriter.Create(textWriter, settings))
{
serializer.WriteObject(xmlWriter, obj);
}
return textWriter.ToString();
}
}
public static T DeserializeXml<T>(this string xml, Type[] extraTypes)
{
return xml.DeserializeXml<T>(new DataContractSerializer(typeof(T), extraTypes));
}
public static T DeserializeXml<T>(this string xml, DataContractSerializer serializer)
{
using (var textReader = new StringReader(xml ?? ""))
using (var xmlReader = XmlReader.Create(textReader))
{
return (T)(serializer ?? new DataContractSerializer(typeof(T))).ReadObject(xmlReader);
}
}
}
public static class ConsoleAndDebug
{
public static void WriteLine(object s)
{
Console.WriteLine(s);
Debug.WriteLine(s);
}
}
public class AssertionFailedException : System.Exception
{
public AssertionFailedException() : base() { }
public AssertionFailedException(string s) : base(s) { }
}
public static class Assert
{
public static void IsTrue(bool value)
{
if (value == false)
throw new AssertionFailedException("failed");
}
}
Another solution would be to replace your List<ModuleData>
with a custom collection that implements IXmlSerializable
and handles the polymorphic serialization entirely manually, caching the XML for unknown polymorphic subtypes in a list of unknown elements. I wouldn't recommend that however since even straightforward implementations of IXmlSerializable
can be quite complex, as shown here and, e.g., here.