go xml-parsing tokenize large-files

How to implement progress counter when parsing large XML files in Go/Golang with xml.NewDecoder(xmlFile)?


I wrote some code to parse large XML files (>3GB) in Go, following the example at https://blog.singleton.io/posts/2012-06-19-parsing-huge-xml-files-with-go/

The idea is to create decoder := xml.NewDecoder(xmlFile), then iterate over the file with decoder.Token() and meanwhile inspect all xml.StartElement. Whenever the right element is found, it gets decoded with decoder.DecodeElement().

That all works very well.

What I would like to have now is a way to show progress to the user, something like "x percent of file processed".

I know how to get the file size of the XML (see "How to get file length in Go?").

But how can I get the actual (or relative) position of decoder.Token()?


Solution

  • xml.Decoder has a method InputOffset that returns the current byte offset in the input stream. Do you need anything else?