I need to send a large amount of data via XML and my Docker container runs out of memory when performing the task. Is there a way using Go to incrementally marshal a large XML document and also incrementally write it to a file so as to minimize memory usage?
Use xml.Encoder to stream the XML output to an io.Writer that may be a network connection (net.Conn) or a file (os.File). The complete result will not be kept in memory.
You may use Encoder.Encode() to encode a Go value to XML. Generally you may pass any Go value that you would pass to xml.Marshal().
Encoder.Encode() only helps if the data you want to marshal is already in memory, which may or may not be feasible for you. For example, if you want to marshal a large list that cannot (or should not) be read into memory, Encode() alone will not solve the problem.
If the input data also cannot be held in memory, then you may construct the XML output from tokens and elements. You may use Encoder.EncodeToken() for this, which allows you to write "parts" of the resulting XML document.
For example, if you want to write a large list to the output, you may write a start element tag (e.g. <list>), then write the elements of the list one by one (each fetched from a database or a file, or constructed by an algorithm on the fly), and once the list is marshaled, write the closing tag (</list>).
Here's a simple example of how you can do that:
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

type Student struct {
	ID   int
	Name string
}

func main() {
	// he ("handle error") panics on any non-nil error.
	he := func(err error) {
		if err != nil {
			panic(err) // In your app, handle error properly
		}
	}

	// For demo purposes we use an in-memory buffer,
	// but this may be an os.File too.
	buf := &bytes.Buffer{}
	enc := xml.NewEncoder(buf)
	enc.Indent("", " ")

	// Open the enclosing <list> element by hand.
	he(enc.EncodeToken(xml.StartElement{Name: xml.Name{Local: "list"}}))
	for i := 0; i < 3; i++ {
		// Here you can fetch / construct the records
		he(enc.Encode(Student{ID: i, Name: string(rune('A' + i))}))
	}
	// Close </list> and flush whatever is still buffered.
	he(enc.EncodeToken(xml.EndElement{Name: xml.Name{Local: "list"}}))
	he(enc.Flush())

	fmt.Println(buf.String())
}
Output of the above is (try it on the Go Playground):
<list>
 <Student>
  <ID>0</ID>
  <Name>A</Name>
 </Student>
 <Student>
  <ID>1</ID>
  <Name>B</Name>
 </Student>
 <Student>
  <ID>2</ID>
  <Name>C</Name>
 </Student>
</list>