My project at work is using the Jackson JSON serializer to convert a bunch of Java objects into Strings in order to send them to REST services.
Some of these objects contain sensitive data, so I've written custom serializers that serialize these objects to JSON strings, gzip them, and then encrypt them using AES. This turns the strings into byte arrays, so I use the Base64 encoder in Apache Commons Codec to convert the byte arrays back into strings. The custom deserializers behind the REST interfaces reverse the process: Base64-decode -> decrypt -> decompress -> deserialize using the default Jackson deserializer.
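Roughly, the write side looks like the following minimal sketch. It's written against the Jackson 2.x API with a hypothetical serializer name and key source (my actual classes aren't shown here), and real code would need an explicit cipher mode and IV rather than the bare "AES" transformation:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.util.zip.GZIPOutputStream;

import javax.crypto.Cipher;
import javax.crypto.SecretKey;

import org.apache.commons.codec.binary.Base64;

import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.databind.JsonSerializer;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializerProvider;

// Hypothetical custom serializer: JSON -> gzip -> AES -> Base64.
public class SensitiveObjectSerializer extends JsonSerializer<Object> {

    private final ObjectMapper plainMapper = new ObjectMapper();
    private final SecretKey aesKey; // supplied by the application's key management

    public SensitiveObjectSerializer(SecretKey aesKey) {
        this.aesKey = aesKey;
    }

    @Override
    public void serialize(Object value, JsonGenerator gen, SerializerProvider serializers)
            throws IOException {
        try {
            // 1. Default Jackson serialization to JSON bytes.
            byte[] json = plainMapper.writeValueAsBytes(value);

            // 2. Gzip to offset the later Base64 size increase.
            ByteArrayOutputStream gzipped = new ByteArrayOutputStream();
            try (GZIPOutputStream gzip = new GZIPOutputStream(gzipped)) {
                gzip.write(json);
            }

            // 3. AES-encrypt the compressed bytes. NOTE: the bare "AES"
            //    transformation defaults to ECB; real code should pick an
            //    explicit mode (e.g. CBC or GCM) with a random IV.
            Cipher cipher = Cipher.getInstance("AES");
            cipher.init(Cipher.ENCRYPT_MODE, aesKey);
            byte[] encrypted = cipher.doFinal(gzipped.toByteArray());

            // 4. Base64-encode (Commons Codec) so the result is a JSON string.
            gen.writeString(Base64.encodeBase64String(encrypted));
        } catch (GeneralSecurityException e) {
            throw new IOException("Encryption of sensitive payload failed", e);
        }
    }
}
```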
Base64 encoding increases the size of the output (the gzip step in serialization is meant to help ameliorate this increase), so I checked Google for a more efficient alternative, which led me to a previous Stack Overflow thread that brought up Ascii85 encoding: Base64 adds 33% to the size of the output, while Ascii85 adds only 25%.
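Those percentages fall out of the block sizes: Base64 maps every 3 input bytes to 4 output characters (x4/3, +33%), while Ascii85 maps every 4 input bytes to 5 characters (x5/4, +25%). A quick check of the Base64 figure with Commons Codec (the byte count here is just illustrative):

```java
import org.apache.commons.codec.binary.Base64;

public class EncodingOverheadDemo {
    public static void main(String[] args) {
        byte[] raw = new byte[300];                  // any 300 bytes of input
        byte[] encoded = Base64.encodeBase64(raw);   // non-chunked encoding
        System.out.println(encoded.length);          // 400 = 300 * 4/3, i.e. +33%
        // Ascii85 would map the same 300 bytes to 375 characters (+25%).
    }
}
```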
I found a few Java Ascii85 implementations, e.g. Apache PDFBox (see the sketch below), but I'm a bit leery of using the encoding - it seems like hardly anybody uses or implements it, which might just mean that Base64 has more inertia, or it may instead mean that there's some wonky problem with Ascii85.
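For what it's worth, here is roughly how I'd expect the PDFBox implementation to be used. This assumes the ASCII85OutputStream class from its org.apache.pdfbox.io package (as in the 1.x line I looked at); these classes are internal helpers for PDF stream filters, so the exact location and behavior should be verified against the PDFBox version you actually depend on:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.pdfbox.io.ASCII85OutputStream;

public class Ascii85Demo {
    public static void main(String[] args) throws IOException {
        byte[] ciphertext = {1, 2, 3, 4, 5, 6, 7, 8}; // stand-in for encrypted bytes

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ASCII85OutputStream ascii85 = new ASCII85OutputStream(buffer)) {
            ascii85.write(ciphertext);
        } // close() flushes any final partial 4-byte group

        // The result is plain ASCII, roughly 5/4 the size of the input.
        System.out.println(buffer.toString("US-ASCII"));
    }
}
```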
Does anybody know more on this subject? Are there any problems with Ascii85 that mean that I should use Base64 instead?
Base64 is way more common. The difference in size really isn't that significant in most cases, and if you add compression at the HTTP level (which will compress the base64) instead of within your payload, you may well find the difference goes away entirely.
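To illustrate: each Base64 character carries only 6 bits of information, so a generic compressor at the transport layer can squeeze the encoded form back to nearly the raw size. A rough sketch of the effect (exact numbers vary with the data and compressor settings):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.security.SecureRandom;
import java.util.zip.GZIPOutputStream;

import org.apache.commons.codec.binary.Base64;

public class TransportCompressionDemo {
    public static void main(String[] args) throws IOException {
        byte[] raw = new byte[10_000];
        new SecureRandom().nextBytes(raw);         // stand-in for AES ciphertext

        byte[] base64 = Base64.encodeBase64(raw);  // ~13,336 bytes (+33%)

        // Gzip the Base64 text, as HTTP gzip Content-Encoding would.
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(compressed)) {
            gzip.write(base64);
        }

        System.out.println("raw bytes:    " + raw.length);
        System.out.println("base64 bytes: " + base64.length);
        System.out.println("gzipped b64:  " + compressed.size()); // close to raw.length again
    }
}
```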
Are there any problems with Ascii85 that mean that I should use Base64 instead?
I would strongly advise using base64 just because it's so much more widespread. It's pretty much the canonical way of representing binary data as text (unless you want to use hex, of course).