c# xml linq benchmarking xmlreader

Which is faster to parse in XML: elements or attributes?


I am writing code that parses XML.

I would like to know which is faster to parse: elements or attributes.

This will have a direct effect on my XML design.

Please target the answers to C# and the differences between LINQ and XmlReader.

Thanks.


Solution

  • XML parsing speed depends on a lot of factors.

    With regard to attributes or elements, pick the one that more closely matches the data. As a guideline, we use attributes for, well, attributes of an object, and elements for contained sub-objects.

    Depending on the amount of data you are talking about, using attributes can save you a bit on the size of your XML streams. For example, <person id="123" /> is smaller than <person><id>123</id></person>. This doesn't really impact the parsing, but it will impact the speed of sending the data across a network or loading it from disk. If we are talking about thousands of such records, then it may make a difference to your application.
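
    To put a number on that size difference, here is a minimal sketch that counts the UTF-8 bytes of the two forms quoted above. The record count of 10,000 is a made-up figure for illustration, not something from the question.

    ```csharp
    using System;
    using System.Text;

    // Hypothetical record count, chosen only to illustrate the scaling.
    const int recordCount = 10_000;

    // The two forms of the same record from the paragraph above.
    string asAttribute = "<person id=\"123\" />";
    string asElement = "<person><id>123</id></person>";

    // UTF-8 byte count of a single record in each form.
    int attributeBytes = Encoding.UTF8.GetByteCount(asAttribute);
    int elementBytes = Encoding.UTF8.GetByteCount(asElement);
    int extraBytes = (elementBytes - attributeBytes) * recordCount;

    Console.WriteLine($"attribute form: {attributeBytes} bytes"); // 19 bytes
    Console.WriteLine($"element form:   {elementBytes} bytes");   // 29 bytes
    Console.WriteLine($"extra bytes over {recordCount} records: {extraBytes}"); // 100000
    ```

    The gap grows with the length of the element name, since an element repeats its name in the closing tag.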

    Of course, if that actually does make a difference then using JSON or some binary representation is probably a better way to go.

    The first question you need to ask is whether XML is even required. If it doesn't need to be human readable then binary is probably better. Heck, a CSV or even a fixed-width file might be better.

    With regard to LINQ vs. XmlReader, this is going to boil down to what you do with the data as you are parsing it. Do you need to instantiate a bunch of objects and handle them that way, or do you just need to read the stream as it comes in? You might even find that basic string manipulation on the data is the easiest or best way to go.
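
    As a rough illustration of the two styles, the sketch below reads the same ids from both document shapes, once with LINQ to XML (XDocument, which loads the whole document into memory before you query it) and once with XmlReader (which walks the stream forward, node by node). The <people>/<person> layout and the helper names IdsWithLinq and IdsWithXmlReader are assumptions made up for this example.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Xml;
    using System.Xml.Linq;

    // Hypothetical sample data: the same two records in each shape.
    const string attributeXml = "<people><person id=\"1\" /><person id=\"2\" /></people>";
    const string elementXml = "<people><person><id>1</id></person><person><id>2</id></person></people>";

    // LINQ to XML: parse the whole document into a tree, then query the tree.
    static List<int> IdsWithLinq(string xml, bool idIsAttribute)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Descendants("person")
            .Select(p => idIsAttribute ? (int)p.Attribute("id") : (int)p.Element("id"))
            .ToList();
    }

    // XmlReader: forward-only pass that never builds a tree.
    static List<int> IdsWithXmlReader(string xml, bool idIsAttribute)
    {
        var ids = new List<int>();
        bool inIdElement = false;
        using XmlReader reader = XmlReader.Create(new StringReader(xml));
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element)
            {
                // Attribute shape: the value is available on the <person> start tag.
                if (idIsAttribute && reader.Name == "person")
                    ids.Add(XmlConvert.ToInt32(reader.GetAttribute("id")));
                // Element shape: remember that the next text node belongs to <id>.
                inIdElement = !idIsAttribute && reader.Name == "id";
            }
            else if (reader.NodeType == XmlNodeType.Text && inIdElement)
            {
                ids.Add(XmlConvert.ToInt32(reader.Value));
            }
            else if (reader.NodeType == XmlNodeType.EndElement)
            {
                inIdElement = false;
            }
        }
        return ids;
    }

    Console.WriteLine(string.Join(",", IdsWithLinq(attributeXml, true)));       // 1,2
    Console.WriteLine(string.Join(",", IdsWithLinq(elementXml, false)));        // 1,2
    Console.WriteLine(string.Join(",", IdsWithXmlReader(attributeXml, true)));  // 1,2
    Console.WriteLine(string.Join(",", IdsWithXmlReader(elementXml, false)));   // 1,2
    ```

    Notice that the LINQ version barely changes between the two shapes, while the XmlReader version needs extra state to handle child elements. That difference in code complexity may matter more to you than raw parse time.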

    The point is, you will probably need to examine the strengths of each approach beyond just "what parses faster".
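
    If parse time does turn out to be one of the things you are weighing, measuring on data shaped like your own will tell you more than a general rule. Below is a crude Stopwatch sketch, not a rigorous benchmark. The record count, iteration count, document layout, and the helper names BuildXml, CountValues, and Time are all made up for illustration. A dedicated harness such as BenchmarkDotNet would likely give more trustworthy numbers.

    ```csharp
    using System;
    using System.Diagnostics;
    using System.IO;
    using System.Text;
    using System.Xml;

    // Builds a document with `count` records in either shape (hypothetical layout).
    static string BuildXml(int count, bool idIsAttribute)
    {
        var sb = new StringBuilder("<people>");
        for (int i = 0; i < count; i++)
        {
            sb.Append(idIsAttribute
                ? $"<person id=\"{i}\" />"
                : $"<person><id>{i}</id></person>");
        }
        return sb.Append("</people>").ToString();
    }

    // Walks the whole document with XmlReader and counts attribute and text values.
    static int CountValues(string xml)
    {
        int values = 0;
        using XmlReader reader = XmlReader.Create(new StringReader(xml));
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element) values += reader.AttributeCount;
            else if (reader.NodeType == XmlNodeType.Text) values++;
        }
        return values;
    }

    // Runs the action once as a warm-up, then times `iterations` further runs.
    static TimeSpan Time(Func<int> action, int iterations)
    {
        action();
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) action();
        sw.Stop();
        return sw.Elapsed;
    }

    const int records = 10_000;  // hypothetical document size
    const int iterations = 20;   // hypothetical repeat count

    string attributeXml = BuildXml(records, idIsAttribute: true);
    string elementXml = BuildXml(records, idIsAttribute: false);

    TimeSpan attributeTime = Time(() => CountValues(attributeXml), iterations);
    TimeSpan elementTime = Time(() => CountValues(elementXml), iterations);

    Console.WriteLine($"attributes: {attributeTime.TotalMilliseconds:F1} ms");
    Console.WriteLine($"elements:   {elementTime.TotalMilliseconds:F1} ms");
    ```

    The timings will vary by machine and runtime, so treat them as a relative comparison for your own data rather than a general answer.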