Tags: java, file, serialization, format, file-format

Choosing the right file format


I am working on an application that needs to save and load data. Most of the data is stored in instances of a class. The data in each instance consists of:

  • a double[][] array
  • some String, int and boolean fields, plus some enums

I have several of these instances, plus some global data, that I want to store in a single file.

So far I save all of it as binary data, using

DataOutputStream out = new DataOutputStream(new FileOutputStream(file));
out.writeInt(...); // and likewise writeBoolean(...), writeUTF(...), etc.

This works well, but it is not very flexible. If I add or remove variables in my container class, there is no simple way to stay compatible with the old format. I started writing a version number at the start of the file, but this results in a big loadData/closeData method for every format version.

Text-based files are out of the question because they use far too much space for my double array.

Do you know a good way to solve this problem, i.e. to define a backward-compatible format that does not require a huge amount of code? Any suggestions are appreciated.

One idea I am considering is to prefix every variable with an integer that identifies it. The format would then be [identifier1][variable1(String)][identifier2][variable2(double[][])]....
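
A minimal sketch of that identifier-per-variable layout, under some assumptions that are not in the question: each field is written as [tag][payload length][payload], the length is what lets a reader skip tags it does not recognise, and a tag of 0 marks the end of a record. The class name, the tag numbers and the `Record` holder are hypothetical and only serve the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class TaggedFormatSketch {
    // Hypothetical field identifiers; a number should never be reused for another field.
    static final int TAG_END = 0;
    static final int TAG_NAME = 1;
    static final int TAG_GRID = 2;

    static class Record {
        String name = "";                   // default when the field is absent
        double[][] grid = new double[0][];  // default when the field is absent
    }

    // Every field is stored as [tag][payload length][payload bytes].
    static void writeField(DataOutputStream out, int tag, byte[] payload) throws IOException {
        out.writeInt(tag);
        out.writeInt(payload.length);
        out.write(payload);
    }

    static byte[] encodeGrid(double[][] grid) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(grid.length);
        for (double[] row : grid) {
            out.writeInt(row.length);
            for (double v : row) out.writeDouble(v); // 8 bytes per value, as before
        }
        return buf.toByteArray();
    }

    static double[][] decodeGrid(byte[] payload) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload));
        double[][] grid = new double[in.readInt()][];
        for (int r = 0; r < grid.length; r++) {
            grid[r] = new double[in.readInt()];
            for (int c = 0; c < grid[r].length; c++) grid[r][c] = in.readDouble();
        }
        return grid;
    }

    static void write(Record rec, DataOutputStream out) throws IOException {
        writeField(out, TAG_NAME, rec.name.getBytes(StandardCharsets.UTF_8));
        writeField(out, TAG_GRID, encodeGrid(rec.grid));
        writeField(out, TAG_END, new byte[0]);
    }

    static Record read(DataInputStream in) throws IOException {
        Record rec = new Record();
        while (true) {
            int tag = in.readInt();
            byte[] payload = new byte[in.readInt()];
            in.readFully(payload);
            if (tag == TAG_END) return rec;
            switch (tag) {
                case TAG_NAME: rec.name = new String(payload, StandardCharsets.UTF_8); break;
                case TAG_GRID: rec.grid = decodeGrid(payload); break;
                default: break; // unknown tag from another format version: ignore it
            }
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        writeField(out, 99, new byte[] {1, 2, 3}); // a field this reader does not know
        Record rec = new Record();
        rec.name = "run-1";
        rec.grid = new double[][] {{1.5, 2.5}, {3.5}};
        write(rec, out);

        Record back = read(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        System.out.println(back.name + " " + back.grid[1][0]); // prints "run-1 3.5"
    }
}
```

With this layout a single read method can serve every format version: fields that were removed are skipped through the default branch, and fields that are missing from an old file keep their default values. The cost is 8 extra bytes per field, which is small next to the double array.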

I also thought of serialization, but I have no experience with it and cannot really tell whether it is the right approach.
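
For reference, a minimal sketch of what the built-in Java serialization route could look like. The `Container` class, its field names and the `Mode` enum are hypothetical stand-ins for the real container class. The sketch assumes the class declares a fixed `serialVersionUID`; with that in place, adding a field later counts as a compatible change in Java serialization, and streams written before the change load with the new field at its default value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationSketch {
    enum Mode { FAST, PRECISE } // hypothetical enum

    // Hypothetical stand-in for the container class from the question.
    static class Container implements Serializable {
        // Kept constant across versions so that older streams remain readable.
        private static final long serialVersionUID = 1L;
        String label;
        int count;
        boolean enabled;
        Mode mode;
        double[][] values;
    }

    static byte[] save(Container c) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(c); // writes all non-transient fields, arrays included
        }
        return buf.toByteArray();
    }

    static Container load(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Container) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Container c = new Container();
        c.label = "demo";
        c.count = 3;
        c.enabled = true;
        c.mode = Mode.PRECISE;
        c.values = new double[][] {{1.0, 2.0}, {3.0}};

        Container back = load(save(c));
        System.out.println(back.label + " " + back.mode + " " + back.values[0][1]);
    }
}
```

The sketch writes to a byte array to stay self-contained; wrapping a FileOutputStream and FileInputStream instead would target a file.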

Please comment if you need more information about the data or some examples.


Solution

  • Protocol Buffers (http://code.google.com/p/protobuf/) is Google's cross-platform, cross-language way of storing data, with backward compatibility already built in. It could be worth a try.

    In particular, this part of the documentation pertains to your case:

    New fields could be easily introduced, and intermediate servers that didn't need to inspect the data could simply parse it and pass through the data without needing to know about all the fields.