java, xml, eclipse-emf, xmlbeans

Partially load an XML file with XMLBeans or EMF


I'm currently using EMF to read ~400 XML files. Each file has about 100,000 lines and consists of descriptive data (~10%, such as IDs and references to other elements) and the actual data (~90%, long strings/texts).

My problem is that when I read all the files I get an OutOfMemoryError. My idea for solving this: load only the IDs etc., and if the user tries to access data that is not currently loaded, load it in the background.

Any idea on how to achieve this with EMF or XMLBeans?

Edit:

My XML has this structure:

<A>
 <B>
  <C></C>
  <C></C>
 </B>
 <B>
  <C></C>
 </B>
</A>

I want to load the root node in any case. In this example I want to skip the C nodes, so that my object tree looks like this:

A
|-B
\-B

Solution

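  • For large XML files, you're much better off using a streaming XML parser than one that reads the whole file at once and builds a DOM from it. A streaming parser hands you one element at a time, so you decide what to keep in memory and what to discard. The standard way to do that in Java is StAX (Streaming API for XML, package javax.xml.stream), which has been part of the JDK since Java 6. SAX is an older, callback-based alternative.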
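
As a rough illustration of the streaming approach, here is a minimal sketch that builds the A/B tree from the question and skips every C element. It uses only the JDK's StAX classes. The class name PartialXmlLoader, the Node type, and the skip set are made up for this example and are not part of EMF or XMLBeans; in a real application Node would be replaced by your own model objects.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class PartialXmlLoader {

    /** Hypothetical lightweight tree node; a stand-in for your own model classes. */
    public static final class Node {
        public final String name;
        public final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Builds a tree of element names, leaving out every element whose name is in 'skip'. */
    public static Node load(Reader in, Set<String> skip) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        Deque<Node> open = new ArrayDeque<>(); // elements that are currently open
        Node root = null;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (skip.contains(name)) {
                        skipElement(reader); // consumes the subtree, including its end tag
                        continue;
                    }
                    Node node = new Node(name);
                    if (open.isEmpty()) {
                        root = node;
                    } else {
                        open.peek().children.add(node);
                    }
                    open.push(node);
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    open.pop();
                }
            }
        } finally {
            reader.close();
        }
        return root;
    }

    /** Advances past the current element and all of its descendants without storing anything. */
    private static void skipElement(XMLStreamReader reader) throws XMLStreamException {
        int depth = 1;
        while (depth > 0) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
    }

    private static void print(Node node, String indent) {
        System.out.println(indent + node.name);
        for (Node child : node.children) {
            print(child, indent + "  ");
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<A><B><C>long text</C><C>more text</C></B><B><C>text</C></B></A>";
        print(load(new StringReader(xml), Set.of("C")), "");
    }
}
```

The parser still has to scan the text inside the skipped elements, but none of it is retained, so memory use depends on the part of the tree you keep rather than on the file size. Loading a skipped element later would need a second pass over the file (for example, re-parsing on demand until the wanted element is reached); that part is not shown here.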