I have to create a custom, byte-based file format for an Android application. The format's main purpose is to store an AES-encrypted file (byte data) together with the metadata needed to decrypt it (such as the IV, the salt and several application settings).
I have several questions on how to design and implement this:
The current idea is to start with a 4-byte magic number, followed by the version number of the format, then the IV and the salt. Next I would include a checksum of the first 4 KB of the original (unencrypted) data, so I can quickly decrypt just the first 4 KB and check whether the provided key was the correct one, and then a checksum of the whole original (unencrypted) data so I can verify the whole file too. That is it for the header. (Do I need the length of the (un)encrypted data? An offset to the data? A checksum for the whole file, header plus body?)
For the body (which is encrypted) I would like to add the original file name and extension (how many bytes should be used for this?), followed by the original file.
The two main approaches I found are ByteArrayOutputStream and RandomAccessFile. With the first I am missing a seek operation: how can I write at a specific position (e.g. for the checksum)? The second seems to work well, but maybe there are better solutions available.
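To make the seek question concrete, here is a minimal sketch of the RandomAccessFile approach: write a placeholder for the checksum, write the body, then seek back and fill the placeholder in. The layout (4-byte magic, 4-byte version, 8-byte checksum slot), the magic value and the class name SeekBackWriter are assumptions for illustration only, and CRC32 is just a stand-in for whatever checksum is finally chosen.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.zip.CRC32;

final class SeekBackWriter {
    // Hypothetical layout: 4-byte magic, 4-byte version, 8-byte checksum slot, then the body.
    static final int MAGIC = 0x41455346;      // arbitrary example value
    static final int VERSION = 1;
    static final long CHECKSUM_OFFSET = 8;    // magic (4) + version (4)
    static final long BODY_OFFSET = 16;       // checksum slot ends here

    /** Writes header and body, then seeks back to fill in the checksum slot. */
    static long write(File file, byte[] body) {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);                 // start from an empty file
            raf.writeInt(MAGIC);
            raf.writeInt(VERSION);
            raf.writeLong(0L);                // placeholder, patched below
            raf.write(body);

            CRC32 crc = new CRC32();
            crc.update(body, 0, body.length);

            raf.seek(CHECKSUM_OFFSET);        // jump back into the header
            raf.writeLong(crc.getValue());
            return crc.getValue();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the stored checksum back from its fixed offset. */
    static long readChecksum(File file) {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(CHECKSUM_OFFSET);
            return raf.readLong();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the checksum sits at a fixed offset, it can be written last even though it appears first in the file. With a ByteArrayOutputStream the equivalent would be to call toByteArray() and patch the resulting array, which requires holding the whole file in memory.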
For my H2 database, I implemented a file system abstraction with multiple file system implementations, including an encrypted file. There are many other implementations as well, for example a cache wrapper.
I would use XTS (XEX-based tweaked-codebook mode with ciphertext stealing); this is what I implemented. It allows random-access reads and writes, and is not much slower than plain AES.
The header you suggest sounds good to me: magic number, then version number of the format. I combined the magic number and the version number (a different version results in a different magic number). With XTS, a stored IV is not needed, because the tweak for each block is derived from the block's position. For the salt I would use plenty, for example 8 bytes. I also stored the number of iterations used to hash the password; I used PBKDF2 for that. I think using something like PBKDF2 is important.
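As a rough illustration of the password hashing step, the following sketch derives key bytes from a password, a salt and an iteration count using the standard javax.crypto PBKDF2 implementation. The class and method names are made up for this example, and "PBKDF2WithHmacSHA1" is chosen only because it is widely available; the key length and iteration count are parameters you would pick and store in the header yourself.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class KeyDerivation {
    /** Generates a random salt of the given length, to be stored in the header. */
    static byte[] newSalt(int length) {
        byte[] salt = new byte[length];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    /**
     * Derives keyBits bits of key material from the password.
     * The same salt and iteration count must be used again when decrypting,
     * which is why both are stored in the file header.
     */
    static byte[] deriveKey(char[] password, byte[] salt, int iterations, int keyBits) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, keyBits);
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
            return factory.generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 not available", e);
        } finally {
            spec.clearPassword();             // do not keep the password in memory
        }
    }
}
```

Storing the iteration count next to the salt means it can be raised in a later version of the application without breaking files written with the old value.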
I made the header 4096 bytes long, to match the block size of regular file systems. This should improve performance if you read and write with a fixed block size. I didn't use any checksum, as my underlying (unencrypted) file has a checksum. I think that's good enough, and maybe a bit more secure than storing an unencrypted checksum of the unencrypted data, but I'm not sure.
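A header along these lines (combined magic and version, salt, iteration count, padded to 4096 bytes) could be sketched with ByteBuffer roughly as follows. The magic value, the field order and the class name FileHeader are assumptions for illustration, not the actual format I used.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class FileHeader {
    static final int HEADER_SIZE = 4096;      // one typical file system block
    static final int SALT_SIZE = 8;
    // Hypothetical magic number; the last byte doubles as the format version.
    static final long MAGIC_V1 = 0x454E_4346_494C_4501L;

    /** Builds a zero-padded header: magic (8 bytes), salt (8 bytes), iterations (4 bytes). */
    static byte[] encode(byte[] salt, int iterations) {
        if (salt.length != SALT_SIZE) {
            throw new IllegalArgumentException("salt must be " + SALT_SIZE + " bytes");
        }
        ByteBuffer buf = ByteBuffer.allocate(HEADER_SIZE); // zero-filled, big-endian
        buf.putLong(MAGIC_V1).put(salt).putInt(iterations);
        return buf.array();
    }

    /** Rejects headers of the wrong size or with an unknown magic/version. */
    private static ByteBuffer checked(byte[] header) {
        if (header.length != HEADER_SIZE) {
            throw new IllegalArgumentException("unexpected header size");
        }
        ByteBuffer buf = ByteBuffer.wrap(header);
        if (buf.getLong(0) != MAGIC_V1) {
            throw new IllegalArgumentException("unknown magic number or version");
        }
        return buf;
    }

    static byte[] readSalt(byte[] header) {
        checked(header);
        return Arrays.copyOfRange(header, 8, 8 + SALT_SIZE);
    }

    static int readIterations(byte[] header) {
        return checked(header).getInt(8 + SALT_SIZE);
    }
}
```

The unused remainder of the block is left as zeros, which leaves room for additional fields in a later version while keeping the encrypted body aligned to a 4096-byte boundary.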
As for the API, both ByteBuffer and byte[] are fine. With ByteBuffer, supporting memory-mapped files is simpler.