c# xml serialization datacontractserializer .net-7.0

Easiest way to serialize an IParsable<T> as a string using the DataContract model?


I want to serialize IParsable<MyType> as <mns:MyType>MyType.ToString()</mns:MyType> in the context of a broader serialization requiring the DataContract serialization approach (readonly fields, etc.).

I have tried several ways, including ISerializable and ISerializationSurrogate, but they have all led to a lot of code that looks confusingly different to the DataContract attributes used throughout, and leaves me sure I'm using the wrong tools. It should be easy - MyType casts to and from string. What's the easiest way?


Solution

  • The steps for implementing a data contract serialization surrogate in .NET Core for some type MyType are as follows:

    1. Create a surrogate type that can be serialized via the data contract serializer, and can be converted back and forth from the original type MyType.

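    2. Create a surrogate provider type that implements ISerializationSurrogateProvider, which declares the following methods: GetSurrogateType(Type), GetObjectToSerialize(object, Type), and GetDeserializedObject(object, Type).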

    3. Before serializing or deserializing, construct your own DataContractSerializer and call DataContractSerializerExtensions.SetSerializationSurrogateProvider(provider) to register an instance of the provider from step #2 with the serializer.

    Important notes

    • The method for using data contract surrogates in .NET Framework is completely different from .NET Core. If you are working in .NET Framework, ignore this answer and see Data Contract Surrogates.

    • In your ToString() method, be sure to format all numeric and other values using CultureInfo.InvariantCulture. If you don't, XML generated in one locale will be unreadable in another. If you need a localized ToString() for some reason, make MyType implement IFormattable and use ToString(default, CultureInfo.InvariantCulture) when serializing. (The format string can be ignored.)

    • You mentioned attempting to use ISerializable for your MyType. This technique stores data as element/value pairs in the XML serialization stream. Since your desired XML contains only text and no markup elements, this approach is probably not appropriate.
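
    To illustrate the note about CultureInfo.InvariantCulture, here is a minimal sketch of a helper that prefers the IFormattable overload when a type offers one. The names InvariantText, ToInvariantString and Measurement are hypothetical and appear nowhere in the question; they only show the pattern.

```csharp
#nullable enable
using System;
using System.Globalization;

// Hypothetical helper: uses the culture-aware overload when the type provides one.
public static class InvariantText
{
    public static string? ToInvariantString(object? value) => value switch
    {
        null => null,
        // The format string is passed as null because it can be ignored here.
        IFormattable formattable => formattable.ToString(null, CultureInfo.InvariantCulture),
        _ => value.ToString(),
    };
}

// Hypothetical value type used only to illustrate the pattern.
public readonly struct Measurement : IFormattable
{
    public Measurement(double value) => Value = value;

    public double Value { get; }

    // Localized text, intended for display only.
    public override string ToString() => ToString(null, CultureInfo.CurrentCulture);

    // Culture-aware overload; the format argument is deliberately ignored.
    public string ToString(string? format, IFormatProvider? formatProvider) =>
        Value.ToString("R", formatProvider);
}
```

    A surrogate could then call InvariantText.ToInvariantString(Value) instead of Value?.ToString(), so the XML text does not depend on the current culture of the machine that wrote it.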

    With that in mind, we can create a generic surrogate for any type implementing IParsable<TSelf> as follows:

    using System;
    using System.Runtime.Serialization;

    public sealed class ParsableSurrogate<TSelf> where TSelf : IParsable<TSelf>
    {
        [IgnoreDataMember]
        public TSelf? Value { get; set; }
        // Serialized member: converts to and from the string form of TSelf.
        public string? TextValue
        {
            get => Value?.ToString();
            set => Value = value == null ? default : TSelf.Parse(value, default);
        }
    }
    

    Next, make the following ISerializationSurrogateProvider:

    using System;
    using System.Collections.Generic;
    using System.Runtime.Serialization;

    public class SerializationSurrogateProvider : ISerializationSurrogateProvider
    {
        readonly Dictionary<Type, (Type SurrogateType, Func<object, Type, object> ToSurrogate)> toSurrogateDictionary = new();
        readonly Dictionary<Type, (Type OriginalType, Func<object, Type, object> ToOriginal)> fromSurrogateDictionary = new();
        
        public Type GetSurrogateType(Type type) =>
            toSurrogateDictionary.TryGetValue(type, out var entry) ? entry.SurrogateType : type;
    
        // TODO: check to see whether null objects are handled correctly
    
        public object GetObjectToSerialize(object obj, Type targetType) =>
            toSurrogateDictionary.TryGetValue(obj.GetType(), out var entry) ? entry.ToSurrogate(obj, targetType) : obj;
    
        public object GetDeserializedObject(object obj, Type targetType) =>
            fromSurrogateDictionary.TryGetValue(obj.GetType(), out var entry) ? entry.ToOriginal(obj, targetType) : obj;
            
        public SerializationSurrogateProvider AddParsable<TParsable>() where TParsable : IParsable<TParsable>
        {
            toSurrogateDictionary.Add(typeof(TParsable), (typeof(ParsableSurrogate<TParsable>), (obj, t) => new ParsableSurrogate<TParsable> { Value = (TParsable)obj }));
            fromSurrogateDictionary.Add(typeof(ParsableSurrogate<TParsable>), (typeof(TParsable), (obj, t) => ((ParsableSurrogate<TParsable>)obj).Value!));
            return this;
        }
    }
    

    Now, let's say your MyType looks like this:

    using System;
    using System.Text.Json;

    public class MyType : IParsable<MyType>
    {
        public MyType(string? value1, string? value2) => (this.Value1, this.Value2) = (value1, value2);
        
        public string? Value1 { get; }
        public string? Value2 { get; }
        
        public override string ToString() => JsonSerializer.Serialize(this);
        
        public static MyType Parse (string s, IFormatProvider? provider) => JsonSerializer.Deserialize<MyType>(s) ?? throw new ArgumentException();
        public static bool TryParse (string? s, IFormatProvider? provider, out MyType result) => throw new NotImplementedException("not needed for the question");
    }
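
    The TryParse stub above throws because the question does not need it. If a working version is wanted, one possible sketch (an assumption, not part of the original answer) treats null input, malformed JSON, and the JSON literal null as parse failures. The rest of the class is repeated so the block stands alone:

```csharp
#nullable enable
using System;
using System.Diagnostics.CodeAnalysis;
using System.Text.Json;

public class MyType : IParsable<MyType>
{
    public MyType(string? value1, string? value2) => (Value1, Value2) = (value1, value2);

    public string? Value1 { get; }
    public string? Value2 { get; }

    public override string ToString() => JsonSerializer.Serialize(this);

    public static MyType Parse(string s, IFormatProvider? provider) =>
        JsonSerializer.Deserialize<MyType>(s) ?? throw new ArgumentException();

    // Sketch: report failure instead of throwing when the text is missing or is not valid JSON.
    public static bool TryParse([NotNullWhen(true)] string? s, IFormatProvider? provider, [MaybeNullWhen(false)] out MyType result)
    {
        result = null;
        if (s is null)
            return false;
        try
        {
            result = JsonSerializer.Deserialize<MyType>(s);
        }
        catch (JsonException)
        {
            // Malformed JSON is treated as a failed parse.
            return false;
        }
        // The JSON literal null deserializes to null, which also counts as a failure.
        return result is not null;
    }
}
```

    The provider argument is ignored here, as it is in Parse, because the JSON text form does not depend on culture.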
    

    Then, to serialize to and from XML, create a DataContractSerializer as follows for any model type that contains an instance of MyType:

    var serializer = new DataContractSerializer(model.GetType());
    serializer.SetSerializationSurrogateProvider(new SerializationSurrogateProvider().AddParsable<MyType>());
    
    var xml = model.ToContractXml(serializer : serializer);
    
    var model2 = DataContractSerializerHelper.FromContractXml<Model>(xml, serializer);
    

    Using the extension methods:

    using System.IO;
    using System.Runtime.Serialization;
    using System.Xml;

    public static partial class DataContractSerializerHelper
    {
        public static string ToContractXml<T>(this T obj, DataContractSerializer? serializer = null, XmlWriterSettings? settings = null)
        {
            serializer = serializer ?? new DataContractSerializer(obj == null ? typeof(T) : obj.GetType());
            using (var textWriter = new StringWriter())
            {
                settings = settings ?? new XmlWriterSettings { Indent = true };
                using (var xmlWriter = XmlWriter.Create(textWriter, settings))
                {
                    serializer.WriteObject(xmlWriter, obj);
                }
                return textWriter.ToString();
            }
        }
    
        public static T? FromContractXml<T>(string xml, DataContractSerializer? serializer = null)
        {
            using (var textReader = new StringReader(xml ?? ""))
            using (var xmlReader = XmlReader.Create(textReader))
            {
                return (T?)(serializer ?? new DataContractSerializer(typeof(T))).ReadObject(xmlReader);
            }
        }
    }
    

    When MyType is the root model itself, the XML generated will look like the following:

    <ParsableSurrogateOfMyTypeMTRdQN6P xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/">
      <TextValue>{"Value1":"hello","Value2":"there"}</TextValue>
    </ParsableSurrogateOfMyTypeMTRdQN6P>
    

    Demo fiddle #1 here.

    While this works, you will notice a couple of problems:

    1. The text value is nested inside a <TextValue> element. In your question, you made it clear you don't want that.

      Since DataContractSerializer does not support an equivalent to the [XmlText] attribute of XmlSerializer, it will be necessary to implement IXmlSerializable.

    2. The root element name, <ParsableSurrogateOfMyTypeMTRdQN6P>, is stupid, and doesn't match your requirement, which is <mns:MyType>.

      Since we will be implementing IXmlSerializable, it will be necessary to use the solution from this answer to "How can I control the root element namespace and name when serializing an IXmlSerializable object with the data contract serializer?".

    This requires modifying ParsableSurrogate<TSelf> as follows:

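    using System;
    using System.Globalization;
    using System.Reflection;
    using System.Runtime.Serialization;
    using System.Xml;
    using System.Xml.Schema;
    using System.Xml.Serialization;

    [XmlSchemaProvider("GetSchemaMethod")]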
    public sealed class ParsableSurrogate<TSelf> : IXmlSerializable where TSelf : IParsable<TSelf>
    {
        public TSelf? Value { get; set; }
    
        public void ReadXml(XmlReader reader)
        {
            reader.MoveToContent();
            Value = TSelf.Parse(reader.ReadElementContentAsString(), CultureInfo.InvariantCulture);
        }
        
        public void WriteXml(XmlWriter writer) => writer.WriteValue(Value?.ToString());
        public XmlSchema? GetSchema() => null;
        
        const string DataContractNamespacePrefix = "http://schemas.datacontract.org/2004/07/";
    
        static void GetDataContractNamespaceAndName(Type type, out string name, out string @namespace)
        {
            // TODO: tweak this logic as required to get the correct namespace and name
            (name, @namespace) = (type.Name, DataContractNamespacePrefix + type.Namespace);
            if (type.GetCustomAttribute<DataContractAttribute>() is {} attr)
                (name, @namespace) = (attr.Name ?? name, attr.Namespace ?? @namespace);
        }
        
        // This is the method named by the XmlSchemaProviderAttribute applied to the type.
        public static XmlQualifiedName GetSchemaMethod(XmlSchemaSet xs)
        {
            GetDataContractNamespaceAndName(typeof(TSelf), out var name, out var @namespace);
            // Create the TSelf type as a restriction of string
            var tSelfType = new XmlSchemaSimpleType
            {
                Name = name,
                Content = new XmlSchemaSimpleTypeRestriction { BaseTypeName = XmlSchemaType.GetBuiltInSimpleType(XmlTypeCode.String).QualifiedName },
            };
            // Create the top-level TSelf element.
            var tSelfElement = new XmlSchemaElement
            {
                Name = name,
                SchemaTypeName = new XmlQualifiedName(tSelfType.Name, @namespace),
            };
            // Create a schema with the type & element
            var tSelfSchema = new XmlSchema
            {
                TargetNamespace = @namespace,
                // Add the top-level element and types to the schema
                Items = { tSelfElement, tSelfType },
            };
            xs.Add(tSelfSchema);
            return new XmlQualifiedName(name, @namespace);
        }
    }
    

    Then add the required data contract namespace to MyType like so:

    [DataContract(Namespace = "https://stackoverflow.com/questions/75912029")] // Replace with whatever namespace you need
    public class MyType : IParsable<MyType>
    {
        // Members are unchanged from the earlier definition of MyType.
    }
    

    Having done this, the following XML will be generated for MyType:

    <MyType xmlns="https://stackoverflow.com/questions/75912029">{"Value1":"hello","Value2":"there"}</MyType>
    

    And for a model that contains a list of MyType such as the following:

    public class Model
    {
        public List<MyType?> MyTypes { get; set; } = new ();
    }
    

    The following XML is generated, as required:

    <Model xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/">
      <MyTypes xmlns:d2p1="https://stackoverflow.com/questions/75912029">
        <d2p1:MyType>{"Value1":"hello","Value2":"there"}</d2p1:MyType>
        <d2p1:MyType i:nil="true" />
        <d2p1:MyType>{"Value1":"foo","Value2":"bar"}</d2p1:MyType>
      </MyTypes>
    </Model>
    

    Demo fiddle #2 here.