java, out-of-memory, memory-mapped-files, xpath-2.0, vtd-xml

How to split a large XML file (more than 3GB) using VTD-XML extended


I have to split an XML file that is at least 3GB in size. We can give the 64-bit JVM only 1.5GB of heap space on Windows. All the example code I have found on the Internet uses VTDNav, not VTDNavHuge. The goal is to read this huge XML file, extract a particular node from it using XPath, and create a new XML file from the extracted content. I always get an OutOfMemoryError, even though VTD-XML extended (that is, VTDNavHuge) is said to handle files of up to 256GB. I am trying to use memory-mapped mode when parsing the file with VTD-XML extended. Please help me with sample code that completes this task in the environment described: a file larger than 3GB and 1.5GB of heap space.


Solution

  • This is a demonstration of how to use the extended VTD-XML parser to process a large XML file. You need a 64-bit JVM to take full advantage of extended VTD-XML.

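    import com.ximpleware.extended.*;

    public class mem_mapped_read {
        public static void main(String[] s) throws Exception {
            VTDGenHuge vg = new VTDGenHuge();
            // MEM_MAPPED memory-maps the XML file rather than reading it fully into the heap
            if (vg.parseFile("test.xml", true, VTDGenHuge.MEM_MAPPED)) {
                VTDNavHuge vnh = vg.getNav();
                AutoPilotHuge aph = new AutoPilotHuge(vnh);
                aph.selectXPath("//*"); // matches every element; substitute your own XPath
                int i;
                // evalXPath() returns the index of the next match, or -1 when there are no more
                while ((i = aph.evalXPath()) != -1) {
                    System.out.println(" element name is " + vnh.toString(i));
                }
            }
        }
    }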
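
The demonstration above only prints element names, while the question also asks how to write the selected node to a new file. Once you know the byte offset and length of a fragment in the source file, the bytes can be copied directly from file to file without passing through the Java heap. The following is a minimal sketch of that copying step using only `java.nio`. `FragmentCopy` and `copyRange` are hypothetical names introduced for illustration, VTD-XML is not called here, and it assumes the parser gives you an offset and length for the matched element (check the VTD-XML extended Javadoc for the exact call).

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of VTD-XML.
final class FragmentCopy {

    /**
     * Copies `length` bytes starting at byte `offset` of `src` into `dst`.
     * Offsets are longs, so positions beyond 2GB are fine.
     */
    static void copyRange(Path src, long offset, long length, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long copied = 0;
            // transferTo may move fewer bytes than requested, so loop until done
            while (copied < length) {
                long n = in.transferTo(offset + copied, length - copied, out);
                if (n <= 0) {
                    throw new IOException("Unexpected end of file at byte " + (offset + copied));
                }
                copied += n;
            }
        }
    }

    // Assumed command line: java FragmentCopy <source> <offset> <length> <target>
    public static void main(String[] args) throws IOException {
        copyRange(Paths.get(args[0]), Long.parseLong(args[1]),
                Long.parseLong(args[2]), Paths.get(args[3]));
    }
}
```

Note that this copies raw bytes only. If the fragment relies on namespace declarations from an ancestor element, or the new file needs an XML declaration, you would have to write those yourself.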