Before saying anything I have to say that, albeit I'm an experienced programmer in Java, I'm rather new to C / C++ programming.
I have to save a binary file in a format that makes it accessible from different operating systems & platforms. It should be very efficient because I have to deal with a lot of data. What approaches should I investigate for that? What are the main advantages and disadvantages?
Currently I'm thinking about using the network notation (something like htonl
that is available both under unix and windows ). Is there a better way?
I think there are a couple of decent choices for this kind of task.
In most cases, my first choice would probably be Sun's (now Oracle's) XDR. It's used in Sun's implementation of RPC, so it's been pretty heavily tested for quite a while. It's defined in RFC 1832, so documentation is widely available. There are also libraries (portable and otherwise) that know how to convert to/from this format. On-wire representation is reasonably compact and conversion fairly efficient.
The big potential problem with XDR is that you do need to know what the data represents to decode it -- i.e., you have to (by some outside means) ensure that the sender and receiver agree on (for example) the definition of the structs they'll send over the wire, before the receiver can (easily) understand what's being sent.
If you need to create a stream that's entirely self-describing, so somebody can figure out what it contains based only on the content of the stream itself, then you might consider ASN.1. It's crufty and nasty in some ways, but it does produce self-describing streams, is publicly documented, and it's used pretty widely (though mostly in rather limited domains). There are a fair number of libraries that implement encoding and decoding. I doubt anybody really likes it much, but if you need what it does, it's probably the first choice, if only because it's already known and somewhat accepted.