Search code examples
c#.netxmlnamespacesdatacontractserializer

Should I use a Namespace of an XML file to identify its version


I'm using DataContractSerializer to serialize a class with DataContract and DataMember attributes to an XML file. My class could potentially change later, and thus the format of the serialized files could also change. I'd like to tag the files I'm saving with a version number so I at least know what version each file is from. I'm still deciding how and if I want to add functionality that will migrate files in older formats to later formats. But right now I'd be happy with just identifying a version mismatch.

Is the namespace of the XML file the correct place to store the version of the file? I was thinking of attributing my class with a DataContract attributes as follows.

[DataContract(Name="MyClass",Namespace="http://www.mycompany.com/MyProject/1.0
public class MyClass
    ...

Then later if MyClass changes I would change the namespace...

[DataContract(Name="MyClass",Namespace="http://www.mycompany.com/MyProject/2.0)]
public class MyClass
    ...

Is this the correct usage of XML namespaces, or is there another more prefered way to save the version of an XML file?


Solution

  • You can do it this way, but then the XML representation of your data becomes completely different from version to version from XML Infoset point of view (in which namespace is the part of the qualified name of the element), so you have neither backwards nor forwards compatibility.

    Now, one advantage XML has is that it can be easily processed in a forward-compatible way with technologies such as XPath and XSLT - you just pick the elements you can interpret, and leave anything you don't recognize as is. But this requires elements with the same meaning to retain the same name (including namespace) between versions.

    In general, it is best to make your schemas forward-compatible. If you can't achieve that, you might still want to provide as much compatibility as possible with existing tools (it is often easier to achieve compatibility against tools which only read data, rather than with those which also write it). Consequently, you avoid storing version number in such cases, and just try to parse whatever you're given, signalling an error if the input is definitely malformed.

    If you come to the point where you absolutely must break compatibility in both directions and start from a clean slate, the suggested way of handling this for WCF data contracts is indeed by changing the namespace, as described in best practices on data contract versioning. There are a few minor variations there as well, such as using publication date instead of version number in the URL (W3C is quite fond of this for their schemas), but these are mostly stylistic.