Modifying XML files with StAX is possible. What I am trying to understand is:
--> Is it possible with StAX to update the XML document as and when it encounters an event? If so, there is no huge memory footprint, which is great.
Example: I am reading a Customers.xml file and need to change the State information for each customer from state name to state code. When I encounter the content
<State>California</State>
I want to change it to <State>CA</State>
So with StAX, can this modification to the source file happen immediately after reading <State>California</State>,
with the parser moving on to the next customer record only after that?
That is, when the second customer's record is read, the first customer's state has already been updated to the state code in the XML.
or
--> Does it handle updates by temporarily keeping track of the changes to be made and then updating the whole document in a single go after parsing the entire document? In this case I would guess there will be a huge memory footprint if there are many changes to large documents (say, a 10 GB XML file).
Example continued:
When the second customer is processed, StAX knows that the state field for the first customer needs to be updated, but it defers the update until all the customer records are read. It can use some in-memory mechanism to keep track of what needs to be updated in the XML.
You cannot change XML files in-place with StAX, but you can read the file in, apply changes on the fly, and write the result to another file. The StAX events (including the modified ones) are written to the target file immediately, apart from internal buffering.
So neither the size of your XML file nor the number of changes matters for memory use.
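A minimal sketch of that read-modify-write loop with the StAX event API, using the State example from the question. The class name `StateCodeRewriter`, the `rewrite` method and the two-entry lookup table are assumptions made up for illustration; a real table would cover every state.

```java
import java.io.Reader;
import java.io.Writer;
import java.util.Map;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class StateCodeRewriter {
    // Hypothetical lookup table, only two entries for the sake of the example.
    static final Map<String, String> STATE_CODES = Map.of("California", "CA", "Texas", "TX");

    static void rewrite(Reader in, Writer out) throws XMLStreamException {
        XMLInputFactory inputFactory = XMLInputFactory.newInstance();
        // Coalesce adjacent text so a state name arrives as one Characters event.
        inputFactory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLEventReader reader = inputFactory.createXMLEventReader(in);
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        XMLEventFactory events = XMLEventFactory.newInstance();

        boolean inState = false;
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                inState = event.asStartElement().getName().getLocalPart().equals("State");
            } else if (event.isEndElement()) {
                inState = false;
            } else if (inState && event.isCharacters()) {
                // Swap the text only when the name is known; otherwise pass it through.
                String code = STATE_CODES.get(event.asCharacters().getData().trim());
                if (code != null) {
                    event = events.createCharacters(code);
                }
            }
            // Each event goes to the output right away; nothing is collected in memory.
            writer.add(event);
        }
        writer.flush();
        writer.close();
        reader.close();
    }
}
```

For real files you would pass a reader on Customers.xml and a writer on a second file, and if needed move the second file over the original once the loop has finished.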
If your changes depend on other parts of the XML, it becomes more difficult. In that case you can process the XML file in two passes: pass 1 collects all the information needed for the changes, and pass 2 applies the changes using the information gathered in pass 1. Or you can use a totally different approach, such as an XML database (e.g. BaseX), and apply your changes with XQuery.
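The two-pass idea could look roughly like this. The scenario is invented for illustration and is not from the question: pass 1 counts the `<Customer>` elements, and pass 2 copies the document while adding that number as a `count` attribute on the root `<Customers>` element, which cannot be done in one pass because the root tag is written before the customers are seen. The class and method names are likewise assumptions.

```java
import java.io.Reader;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

class TwoPassCount {
    // Pass 1: stream through once and keep only a counter in memory.
    static int countCustomers(Reader in) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        int count = 0;
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && reader.getLocalName().equals("Customer")) {
                count++;
            }
        }
        reader.close();
        return count;
    }

    // Pass 2: copy every event, replacing the root start tag with one that carries the count.
    static void writeWithCount(Reader in, Writer out, int count) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        XMLEventFactory events = XMLEventFactory.newInstance();
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()
                    && event.asStartElement().getName().getLocalPart().equals("Customers")) {
                StartElement root = event.asStartElement();
                // Keep the existing attributes and append the new one.
                List<Attribute> attrs = new ArrayList<>();
                root.getAttributes().forEachRemaining(attrs::add);
                attrs.add(events.createAttribute("count", Integer.toString(count)));
                event = events.createStartElement(root.getName(), attrs.iterator(),
                        root.getNamespaces());
            }
            writer.add(event);
        }
        writer.flush();
        writer.close();
        reader.close();
    }
}
```

The input has to be opened twice, once per pass. Memory use stays bounded as long as what pass 1 collects is small compared to the document.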