delphi com sax msxml6

How to combine large XML files using MSXML SAX in Delphi


Edit: My (incomplete and very rough) XmlLite header translation is available on GitHub

What is the best way to do a simple combine of massive XML documents in Delphi with MSXML without using DOM? Should I use the COM components SAXReader and XMLWriter and are there any good examples?

The transformation simply combines all the Contents elements under the root (Container) of many big files (60MB+) into one huge file (~1GB).

<Container>
    <Contents />
    <Contents />
    <Contents />
</Container>

I have it working in the following C# code using an XmlWriter and XmlReaders, but it needs to happen in a native Delphi process:

using System.Xml;

var files = new string[] { @"c:\bigFile1.xml", @"c:\bigFile2.xml", @"c:\bigFile3.xml", @"c:\bigFile4.xml", @"c:\bigFile5.xml", @"c:\bigFile6.xml" };

using (var writer = XmlWriter.Create(@"c:\HugeOutput.xml", new XmlWriterSettings { Indent = true }))
{
    writer.WriteStartElement("Container");

    foreach (var inputFile in files)
        using (var reader = XmlReader.Create(inputFile))
        {
            reader.MoveToContent(); // position on the root Container element
            while (!reader.EOF)
            {
                // WriteNode already advances the reader past the copied element,
                // so call Read() only when nothing was copied; otherwise an
                // adjacent Contents element could be skipped.
                if (reader.IsStartElement("Contents"))
                    writer.WriteNode(reader, true);
                else
                    reader.Read();
            }
        }

    writer.WriteEndElement(); // end the Container element
}

We already use MSXML DOM in other parts of the system and I do not want to add new components if possible.


Solution

  • XmlLite is a native C++ port of the XML reader and writer from System.Xml, and it provides the same pull-parsing programming model. It ships in the box with Windows Server 2003 SP2, Windows XP SP3, and later. You'll need a Delphi header translation first; after that, the C# code maps almost one-to-one to Delphi.
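For readers who want to see the shape of the streaming merge without .NET or a header translation, here is a minimal sketch in Python. It is neither XmlLite nor MSXML. It uses the standard library's `xml.etree.ElementTree.iterparse` as a stand-in pull-style reader, and the function name `merge_contents` and its parameters are made up for illustration. It assumes the elements carry no XML namespace, as in the sample above.

```python
import xml.etree.ElementTree as ET


def merge_contents(input_paths, output_path,
                   root_tag="Container", item_tag="Contents"):
    """Copy every top-level <Contents> element of each input into one output.

    Hypothetical helper: only one Contents subtree is held in memory at a time.
    Returns the number of elements copied.
    """
    copied = 0
    with open(output_path, "w", encoding="utf-8") as out:
        out.write('<?xml version="1.0" encoding="utf-8"?>\n')
        out.write(f"<{root_tag}>\n")
        for path in input_paths:
            depth = 0
            root = None
            for event, elem in ET.iterparse(path, events=("start", "end")):
                if event == "start":
                    depth += 1
                    if depth == 1:
                        root = elem  # the input file's Container element
                    continue
                depth -= 1
                if depth == 1:  # elem is a direct child of the root
                    if elem.tag == item_tag:
                        elem.tail = None  # drop whitespace between siblings
                        out.write("    " + ET.tostring(elem, encoding="unicode") + "\n")
                        copied += 1
                    # Release the finished subtree so memory stays bounded.
                    root.remove(elem)
        out.write(f"</{root_tag}>\n")
    return copied
```

A call would look like `merge_contents([r"c:\bigFile1.xml", r"c:\bigFile2.xml"], r"c:\HugeOutput.xml")`. The structure is the same as in the C# loop: write the outer Container start tag once, stream each input, copy whole Contents subtrees, then close the Container. A Delphi version on top of XmlLite would presumably follow the same steps with its reader and writer interfaces.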