Please forgive my bad English.
I want to read large XML files (> 2 GB). I saw several posts about this and concluded that I should use XmlReader.
For testing purposes, I created a 500 MB XML file and wrote two versions of the code:
First one:
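// Copy the whole file into memory first, then parse from the in-memory copy.
MemoryStream mem = new MemoryStream();
using (Stream file = File.OpenRead(ofd.FileName))
{
    file.CopyTo(mem);
}
mem.Position = 0; // rewind so the reader starts at the beginning
XmlReader reader = XmlReader.Create(mem);
// work with reader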
Second one:
XmlReader reader = XmlReader.Create(ofd.FileName);
// work with reader
ofd.FileName is the path of the XML file.
"work with reader" is the same code in both algorithms.
The speed of my RAM is 15 GB/s. The speed of my SSD is 150 MB/s.
I expected the first algorithm to be at least 100 times faster, but in practice the second algorithm is faster.
First algorithm duration: 10500 milliseconds.
Second algorithm duration: 9500 milliseconds.
Why? Is it because the program has to go through several abstraction layers in the first algorithm?
Thank you for any information.
XmlReader is a forward-only reader, so with the MemoryStream approach you're going through the entire file exactly twice: once to copy it into memory and once to parse it.
Even though the second pass reads directly from memory, you've already paid the "disk" penalty when prebuffering, so the overhead is simply the cost of running over all the data again.
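One way to check this is to time the copy and the parse separately. As a rough estimate from the figures in the question, reading 500 MB at 150 MB/s takes about 3.3 seconds, well under the measured 9.5 seconds, which suggests that parsing accounts for a large share of the time in both cases. The sketch below is only an illustration: the class name ReaderTiming and its methods are made up for this example, and CountElements is a placeholder for your "work with reader" code.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Xml;

// Hypothetical helper class, not part of the original code.
public static class ReaderTiming
{
    // Placeholder for "work with reader": visits every node and counts elements.
    public static long CountElements(XmlReader reader)
    {
        long count = 0;
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element)
                count++;
        }
        return count;
    }

    // Second approach from the question: parse straight from the file.
    public static long ParseFromFile(string path)
    {
        using (XmlReader reader = XmlReader.Create(path))
            return CountElements(reader);
    }

    // First approach from the question, with the copy and the parse timed separately.
    public static long ParseViaMemoryStream(string path, out TimeSpan copyTime, out TimeSpan parseTime)
    {
        Stopwatch sw = Stopwatch.StartNew();
        MemoryStream mem = new MemoryStream();
        using (Stream file = File.OpenRead(path))
            file.CopyTo(mem);      // pass 1: disk to memory
        mem.Position = 0;
        copyTime = sw.Elapsed;

        sw.Restart();
        long count;
        using (XmlReader reader = XmlReader.Create(mem))
            count = CountElements(reader);   // pass 2: memory to parser
        parseTime = sw.Elapsed;
        return count;
    }
}
```

Call ReaderTiming.ParseViaMemoryStream(ofd.FileName, out copyTime, out parseTime) and compare parseTime with the total time of ReaderTiming.ParseFromFile(ofd.FileName). If the two are close, the parsing itself is the bottleneck and prebuffering cannot help. Keep in mind that the operating system may cache the file after the first read, so run each variant several times before comparing.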