I need to parse a huge XML file on the server and send it to the client.
I want to do the parsing on demand: parse and show only the top-level parent nodes at first, and when the client clicks on a parent node, send a request to the server saying which parent was selected, and only then parse and send its children (again, not the whole sub-tree, just the nodes directly below it).
I thought about using a StAX parser, but I don't understand how to work with it when it comes to parent-child relationships. How do I tell the parser not to continue to the next START_ELEMENT, which is the child, but to skip to the next parent on the same level? Also, is there a way to go back with the iterator implementation? After choosing one parent and seeing its children, can I go back and see a previous parent?
I would really appreciate any suggestion!
Thank you.
No, you can't skip a sub-tree of an XML document without parsing it first. That is true for every parser, not just StAX. (Knowing which point to skip to implies that you've already parsed the elements in between.)
However, by maintaining a nesting-level counter that you increment on every start-element event and decrement on every end-element event, it's easy to ignore all the events that come from a level below your target level.
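A minimal sketch of that counter idea, using the standard `javax.xml.stream` cursor API. The class name `LevelFilter`, the method `elementsAtDepth`, and the sample document are made up for illustration, and the input is a `String` only to keep the example self-contained; for a real file you would pass an `InputStream` to `createXMLStreamReader` instead. Note that the parser still reads every event in the document; the counter only decides which ones you act on.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class LevelFilter {

    // Collects the names of elements found exactly at targetDepth.
    // The root element is depth 1, its children are depth 2, and so on.
    static List<String> elementsAtDepth(String xml, int targetDepth) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<String> names = new ArrayList<>();
        int depth = 0;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    depth++;
                    // Events from deeper levels are still parsed, just ignored here.
                    if (depth == targetDepth) {
                        names.add(reader.getLocalName());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    depth--;
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Hypothetical sample document.
        String xml = "<root><a><a1/><a2><deep/></a2></a><b><b1/></b></root>";
        System.out.println(elementsAtDepth(xml, 2)); // [a, b]
        System.out.println(elementsAtDepth(xml, 3)); // [a1, a2, b1]
    }
}
```

To return only the children of one selected parent, the same loop could additionally track whether it is currently inside that parent (for example by counting the depth-2 elements seen so far) and collect names only while it is.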
Parsing is one-way, not random access, so you can't jump back and forth. (Again, that would require the parser to store a representation of everything parsed so far, which is exactly what StAX was created to avoid.) But of course you can try to record the byte position of each parent tag in the file, then later seek to it if you've got the file open for random access. There are quite a few pitfalls to this approach though.
All in all, your use case doesn't look like a good fit for StAX. Have you tried VTD-XML? Depending on how big your file is, it may be exactly what you want.