c# performance stream xmlreader memorystream

XmlReader(Stream(fileName) -> MemoryStream) slower than XmlReader(fileName)


Please forgive my bad English.

I want to read large XML files (> 2 GB). After reading several posts on the subject, I decided to use XmlReader.

For testing, I created a 500 MB XML file and wrote two versions of the code:

First one:

MemoryStream mem = new MemoryStream();
Stream file = File.OpenRead(ofd.FileName);
file.CopyTo(mem);
mem.Position = 0;
file.Close();
XmlReader reader = XmlReader.Create(mem);
// work with reader

Second one:

XmlReader reader = XmlReader.Create(ofd.FileName);
// work with reader

ofd.FileName is the path of the XML file.

The "work with reader" part is the same in both algorithms.

My RAM speed is 15 GB/s and my SSD speed is 150 MB/s.

I expected the first algorithm to be at least 100 times faster, but in practice the second algorithm is faster.

First algorithm duration: 10500 milliseconds.

Second algorithm duration: 9500 milliseconds.

Why? Is it because the program has to go through several additional abstraction layers in the first algorithm?

Thank you for any information.


Solution

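  • XmlReader is a forward-only reader that pulls data from its source as it parses. With the MemoryStream approach you therefore go through the entire file twice: once to copy it from disk into memory, and once more to parse it.

    Even though the second pass runs directly from memory, you have already paid the disk penalty while prebuffering (at the quoted 150 MB/s, roughly 3.3 seconds for 500 MB). The MemoryStream version saves no disk time; it only adds the overhead of running over all the data again.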
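
    To see where the time goes, it may help to time the two passes separately. The following is only a rough sketch, not a rigorous benchmark: the class name XmlReadTiming, the helpers LoadIntoMemory and CountNodes, and the command-line path argument are all made up for illustration, and CountNodes merely stands in for the question's "work with reader".

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Xml;

static class XmlReadTiming
{
    // First pass of the MemoryStream approach: copy the whole file
    // from disk into memory. This pass is paid at disk speed.
    public static MemoryStream LoadIntoMemory(string path)
    {
        var mem = new MemoryStream();
        using (Stream file = File.OpenRead(path))
        {
            file.CopyTo(mem);
        }
        mem.Position = 0;
        return mem;
    }

    // Hypothetical stand-in for "work with reader":
    // visits every node once and counts them.
    public static long CountNodes(XmlReader reader)
    {
        long count = 0;
        while (reader.Read())
        {
            count++;
        }
        return count;
    }

    static void Main(string[] args)
    {
        string path = args[0]; // path to the XML file (assumed input)

        // Approach 1: prebuffer, then parse from memory.
        var sw = Stopwatch.StartNew();
        using (MemoryStream mem = LoadIntoMemory(path))
        {
            Console.WriteLine("copy to memory:    {0} ms", sw.ElapsedMilliseconds);
            sw.Restart();
            using (XmlReader reader = XmlReader.Create(mem))
            {
                Console.WriteLine("nodes: {0}", CountNodes(reader));
            }
            Console.WriteLine("parse from memory: {0} ms", sw.ElapsedMilliseconds);
        }

        // Approach 2: let XmlReader read the file as it parses.
        sw.Restart();
        using (XmlReader reader = XmlReader.Create(path))
        {
            Console.WriteLine("nodes: {0}", CountNodes(reader));
        }
        Console.WriteLine("parse from file:   {0} ms", sw.ElapsedMilliseconds);
    }
}
```

    Keep in mind that the operating system usually caches a file after it has been read once, so whichever approach runs second may look faster than it would on a cold start. Also note that a MemoryStream cannot hold more than about 2 GB, so the prebuffering approach would not work for the > 2 GB files mentioned in the question anyway.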