Tags: performance, windows-phone-7, binary, xml

What does it mean to convert an XML document to binary?


I am having a performance issue with XDocument.Load("large_file.xml"), where it takes about 25 seconds to load the file.

I read in this question that using a binary format could offer up to a 10x performance increase.

What does a binary format look like? How do you go about converting an XML file to it?


Solution

  • Let's start with the implied questions:

    Q: What is a Binary format?

    A: It is a format in which data is represented in a non-textual form. For example, a Java int might be represented as 4 bytes, rather than a sequence of decimal digits and a sign.
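
To make the difference concrete, here is a minimal sketch in Python (chosen only for illustration; the value is arbitrary). It encodes the same integer once as text and once as a fixed 4-byte big-endian value, which is the layout Java's `DataOutputStream.writeInt` produces.

```python
import struct

value = -123456  # arbitrary example value

# Text form: a sign plus decimal digits, one byte per character in ASCII.
as_text = str(value).encode("ascii")

# Binary form: a fixed-width, 4-byte, big-endian, two's-complement integer.
as_binary = struct.pack(">i", value)

print(len(as_text), as_text)            # length varies with the number of digits
print(len(as_binary), as_binary.hex())  # always 4 bytes for a 32-bit int

# Decoding the binary form is a fixed-size read; no digit parsing is needed.
(decoded,) = struct.unpack(">i", as_binary)
```

The text form grows with the magnitude of the number and has to be parsed digit by digit, while the binary form is always 4 bytes.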

    Q: What does it look like?

    A: If you view it with a text editor / viewer, it looks like garbage.

    Q: How do you go about converting an XML file to a binary form?

    A: By hand. Since a binary format is essentially any format that is not text, there is no general-purpose method of converting XML to "binary". You decide what the format is, then write the code that produces it.
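
As a rough illustration of what "by hand" can mean, the sketch below converts a small XML document into a made-up binary layout. Everything about it is an assumption for the example: the `<points>`/`<point>` schema, the `xml_to_binary` name, and the layout itself (a 4-byte record count followed by two 4-byte integers per point). A real application would design the layout around its own data.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical input; a real schema would dictate the binary layout.
XML_TEXT = """<points>
  <point x="1" y="2"/>
  <point x="30" y="-40"/>
  <point x="500" y="600"/>
</points>"""


def xml_to_binary(xml_text):
    """Encode <point x=".." y=".."/> elements as:
    uint32 count, then (int32 x, int32 y) per point, all little-endian."""
    root = ET.fromstring(xml_text)
    points = [(int(p.get("x")), int(p.get("y"))) for p in root.iter("point")]

    out = bytearray(struct.pack("<I", len(points)))  # header: record count
    for x, y in points:
        out += struct.pack("<ii", x, y)              # one fixed-width record
    return bytes(out)


binary = xml_to_binary(XML_TEXT)
print(len(XML_TEXT.encode("utf-8")), "bytes of XML ->", len(binary), "bytes of binary")
```

The conversion is usually run once, ahead of time, so the application only ever loads the binary file. In .NET, a comparable layout could be written with `System.IO.BinaryWriter`.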

    Q: How and why is a binary format faster?

    A: A binary format isn't automatically faster to load than XML (or JSON). The idea is that you (the programmer) design a specific binary format for your application that will be faster to load. You typically do this by:

    • avoiding the inclusion of verbose / repetitive structuring information (e.g. XML tag and attribute names),
    • using data encodings that require less CPU effort to turn into the in-memory representations,
    • avoiding the inclusion of unnecessary metadata,
    • avoiding things that require extra in-memory data copying,
    • and so on.
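
The points above can be sketched on the loading side as well. This is a minimal example under assumed conditions, not a recommended format: the record layout (`HEADER`, `RECORD`) and the `load_points` helper are hypothetical. The file carries no tag or attribute names, each record is decoded with a fixed-size read, and a `memoryview` is used so that slicing the buffer does not copy it.

```python
import struct

HEADER = struct.Struct("<I")   # hypothetical header: uint32 record count
RECORD = struct.Struct("<ii")  # hypothetical record: int32 x, int32 y


def load_points(blob):
    """Decode a blob laid out as HEADER followed by `count` RECORDs."""
    (count,) = HEADER.unpack_from(blob, 0)
    # Slicing a memoryview references the original buffer instead of copying it.
    start = HEADER.size
    body = memoryview(blob)[start : start + count * RECORD.size]
    # Each record is a fixed-size read: no tag names, no text parsing.
    return list(RECORD.iter_unpack(body))


# Build a small blob in the same layout, then load it back.
blob = HEADER.pack(2) + RECORD.pack(1, 2) + RECORD.pack(30, -40)
points = load_points(blob)
print(points)
```

Whether a layout like this actually beats `XDocument.Load` for a given file has to be measured on the target device; the sketch only shows where the savings are supposed to come from.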