java base64 fileinputstream apache-commons-codec

Base64 decoding using apache commons codec failing on very large binary file


I am developing an encryption tool, and our encrypted file format uses Base64 to encode data. I decode files with Apache Commons Codec, using a Base64InputStream wrapped around a FileInputStream. This worked like a charm until I tested it on a large music file. For some mysterious reason, every byte from byte 6028 onward came out as 0. The code that reads the file into the byte[] follows:

FileInputStream filein = new FileInputStream(filename);
Base64InputStream in = new Base64InputStream(filein,false,76,'\n');
byte[] contents = new byte[known_and_tested_correct_filelength];
in.read(contents);

Now, for whatever reason, after byte 6028, everything in contents is 0. However, contents.length is around 300,000 bytes. As you can guess, this did wonders for my application. Does anyone have any inkling of what's going on?


Solution

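  • in.read(contents) is not guaranteed to fill the buffer you pass in. It reads as many bytes as "are ready", up to the buffer's length, and returns how many that was.

    You must then repeat the call for the next chunk, and the next, each time writing at the offset you have reached so far (in.read(contents, offset, contents.length - offset)), until the buffer is full or you get a -1.

    Your current code reads only the first chunk and discards the returned count, so the rest of contents keeps its initial value of 0.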
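
    The loop described above could look roughly like the sketch below. The class and method names (ReadFully, readFully) are made up for illustration and are not part of Commons Codec; the only library call used is the standard InputStream.read(byte[], int, int).

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, named here for illustration only.
final class ReadFully {
    private ReadFully() {}

    /**
     * Keeps calling read() until buf is full or the stream ends.
     * Returns the number of bytes actually stored in buf.
     */
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int offset = 0;
        while (offset < buf.length) {
            // read() may return fewer bytes than requested, so ask
            // only for the part of the buffer that is still empty.
            int n = in.read(buf, offset, buf.length - offset);
            if (n == -1) {
                break; // end of stream before the buffer was filled
            }
            offset += n;
        }
        return offset;
    }
}
```

    With the code from the question this would be called as ReadFully.readFully(in, contents). A return value smaller than contents.length means the decoded stream ended earlier than expected. If you are on Java 9 or later, in.readNBytes(contents, 0, contents.length) should do the same job without a hand-written loop.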