My requirement is to compress an XML file into a binary format, transmit it, and decompress it (lightning fast) before I start parsing it.
There are quite a few binary XML protocols and tools available. I found EXI (Efficient XML Interchange) better than the others. I tried its open-source implementation, EXIficient, and found it good.
I have heard about Google's Protocol Buffers and Facebook's Thrift. Can anyone tell me whether these two can do the job I am looking for?
Or just let me know if there is anything better than EXI that I should look at.
Also, there is a good XML parser, VTD-XML (I haven't tried it myself, just googled it and read some articles), that reportedly achieves better parsing performance than DOM, SAX, and StAX.
I want the best of both worlds: best compression plus best parsing performance. Any suggestions?
One more thing regarding EXI: how can EXI claim to be fast at parsing a decoded XML file when that file is still being parsed by DOM, SAX, or StAX? I would have believed this if there were a separate binary parser for reading the decoded version. Correct me if I am wrong.
Also, is there a good open-source C++ implementation of the EXI format? A Java version is available (EXIficient), but I have not been able to find an open-source C++ implementation.
There is one by AgileDelta, but it is commercial.
You mention protocol buffers (protobuf); this is a binary format, but it has no direct relationship to XML. In particular, no member names (element names / attribute names / namespaces) are encoded: it is just the data (with numeric markers for identifiers).
As such, you cannot reconstruct arbitrary XML from a protobuf stream unless you already know how to map "field 3" etc. back to element and attribute names.
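To make the "numeric markers" point concrete, here is a minimal hand-rolled sketch of the protobuf wire format for two field types (varint and length-delimited). It is not the official protobuf library; the helper names (encode_varint, encode_varint_field, encode_string_field, decode_fields) are made up for illustration, and the field numbers are arbitrary. It only shows that the bytes carry a field number and a wire type, never a name.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_varint_field(field_number: int, value: int) -> bytes:
    """Tag is (field_number << 3) | wire_type; wire type 0 means varint."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Wire type 2 means length-delimited: tag, byte length, then payload."""
    payload = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(payload)) + payload


def read_varint(data: bytes, pos: int):
    """Return (value, next_position) for the varint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_fields(data: bytes):
    """Decode into (field_number, wire_type, value) tuples; no names exist to recover."""
    fields = []
    pos = 0
    while pos < len(data):
        tag, pos = read_varint(data, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:
            value, pos = read_varint(data, pos)
        elif wire_type == 2:
            length, pos = read_varint(data, pos)
            value = data[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} is not handled in this sketch")
        fields.append((field_number, wire_type, value))
    return fields


# Hypothetical message: field 3 holds the integer 150, field 1 holds a string.
message = encode_varint_field(3, 150) + encode_string_field(1, "exi")
print(message.hex())
print(decode_fields(message))
```

A reader of these bytes can tell that field 3 is an integer and field 1 is a run of bytes, but whether field 3 was an element, an attribute, or something else is known only to whoever holds the schema.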
However! If you have an object model that works with both XML and protobuf, the transform is trivial: deserialize with either, serialize with either. How well this works depends on the implementation; for example, it is trivial with protobuf-net and is actually how I do the codegen (load the binary; write it as XML; run the XML through an XSLT layer to emit code).
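As a rough sketch of the "deserialize with either, serialize with either" idea, the following uses only the Python standard library. The Reading class and its fields are hypothetical, and the binary layout is a simple struct-based stand-in, not real protobuf and not how protobuf-net does it; the point is only that one object model can sit between both formats.

```python
import struct
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical fixed header: 4-byte unsigned id, 2-byte unsigned label length.
_HEADER = "<IH"


@dataclass
class Reading:
    sensor_id: int
    label: str

    def to_xml(self) -> str:
        """Serialize to XML; names travel with the data."""
        root = ET.Element("reading", {"sensorId": str(self.sensor_id)})
        root.text = self.label
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text: str) -> "Reading":
        root = ET.fromstring(text)
        return cls(int(root.get("sensorId")), root.text or "")

    def to_binary(self) -> bytes:
        """Serialize to a compact binary form; only the values travel."""
        payload = self.label.encode("utf-8")
        return struct.pack(_HEADER, self.sensor_id, len(payload)) + payload

    @classmethod
    def from_binary(cls, data: bytes) -> "Reading":
        sensor_id, length = struct.unpack_from(_HEADER, data, 0)
        start = struct.calcsize(_HEADER)
        return cls(sensor_id, data[start:start + length].decode("utf-8"))


# Load from XML, send as binary, and turn it back into XML on the other side.
original = Reading.from_xml('<reading sensorId="7">boiler</reading>')
wire = original.to_binary()
restored = Reading.from_binary(wire)
print(len(wire), "bytes on the wire")
print(restored.to_xml())
```

The binary form is smaller precisely because both ends already agree on the layout, which is the same trade-off protobuf makes with its schema.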
If you actually just want to transfer object data (and XML is just a proposed implementation detail), then I thoroughly recommend protobuf: it is platform-independent, has a wide range of implementations, is version-tolerant, produces very small output, and is very fast to process on both read and write.