Efficient Parser for large XMLs


I have very large XML files to process, and I want to convert them to readable PDFs with colors, borders, images, tables and fonts. My machine has limited resources, so the application needs to be as efficient as possible in both memory and CPU use.

I did some research to decide which technology to use, but I could not work out which programming language and API best fit my requirements. I believe DOM is not an option because it builds the whole document in memory. Would Java with a SAX parser fulfill my requirements?

Some people also recommended Python for XML parsing. Is it that good?

I would appreciate your kind advice.


Solution

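  • SAX is a very good parser, but it is the older, callback-based (push) API.

    A newer alternative for parsing XML files efficiently is StAX (Streaming API for XML), a pull-style streaming parser that has been included in the JDK since Java 6. Like SAX, it reads the document as a stream instead of building the whole tree in memory.

    *http://docs.oracle.com/cd/E17802_01/webservices/webservices/docs/1.6/tutorial/doc/SJSXP2.html*

    The linked page also compares the parser APIs, including their memory utilization and features.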

    Thanks, Pavan
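
To make the StAX suggestion concrete, here is a minimal sketch of pull parsing with the standard `javax.xml.stream` API. The class name `StaxRowStreamer`, the element name `row` and the sample XML are hypothetical, chosen only for illustration. The PDF generation step is not shown: the `sink` callback simply stands in for whatever writes each record to the output.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class: streams an XML document and hands the text of
// each matching element to a callback, one record at a time.
class StaxRowStreamer {

    // Returns the number of matching elements seen. Only the current record
    // is held in memory; the document tree is never built.
    static int streamRows(InputStream in, String elementName, Consumer<String> sink)
            throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Assumption: the input needs no DTDs or external entities, so both are disabled.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);

        XMLStreamReader reader = factory.createXMLStreamReader(in);
        int count = 0;
        try {
            while (reader.hasNext()) {
                int event = reader.next(); // pull the next parsing event
                if (event == XMLStreamConstants.START_ELEMENT
                        && elementName.equals(reader.getLocalName())) {
                    // getElementText() works for text-only elements; it throws
                    // XMLStreamException if the element contains child elements.
                    sink.accept(reader.getElementText());
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample input; a real run would open a FileInputStream instead.
        String xml = "<report><row>first</row><row>second</row></report>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        int n = streamRows(in, "row", text -> System.out.println("row: " + text));
        System.out.println("rows seen: " + n);
    }
}
```

Because the application code asks for the next event itself, it can stop early or skip sections without the handler state that SAX callbacks usually need. Memory use stays roughly constant as long as the callback does not accumulate the records.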