Reading a binary file into variable in bash


I have the following bash script. I expect the files out.1 and out.2 to be identical, but they differ. I suspect the problem is how bash handles binary data. What is the proper way to read a binary file into a variable in bash?

    curl -s http://cacerts.digicert.com/DigiCertSHA2HighAssuranceServerCA.crt > out.1
    A=`curl -s http://cacerts.digicert.com/DigiCertSHA2HighAssuranceServerCA.crt`
    echo "$A" >  out.2
    diff out.1 out.2

Solution

  • Bash variables (like environment variables, Unix command-line arguments, and so on) are not binary-safe. The biggest problem is that they cannot contain zero bytes (the ASCII NUL character), because NUL is the string terminator; bash drops any NUL bytes it reads from a command substitution. Newlines are a second problem: command substitution strips all trailing newlines from the captured output, and echo appends one when printing. Finally, some versions of echo treat backslash sequences in their arguments as escapes to be interpreted. In short: don't try to store binary data in the shell.

    But you can convert the data to a text-safe encoding (hex, base64, uuencode, whatever) and store it, pass it around, and so on in that form. Just be sure to decode it back to binary wherever the raw bytes are needed. Here's an example using base64:

    $ curl -s http://cacerts.digicert.com/DigiCertSHA2HighAssuranceServerCA.crt > out.1
    $ a=$(curl -s http://cacerts.digicert.com/DigiCertSHA2HighAssuranceServerCA.crt | base64)
    $ echo "$a" | base64 -d >out.2
    $ diff -s out.*
    Files out.1 and out.2 are identical
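
    The same round trip works with other text encodings. As a rough sketch of the hex variant, using only od and printf (the sample bytes and the file names under $tmp are made up for illustration, and the byte-by-byte decode loop will be slow on large files):

    ```bash
    #!/usr/bin/env bash
    tmp=$(mktemp -d)

    # Sample input: 'a', NUL, 'b', two newlines, and a 0xFF byte (6 bytes).
    printf 'a\000b\n\n\377' > "$tmp/in.bin"

    # Encode: od prints each byte as a two-digit hex value separated by spaces.
    hex=$(od -An -v -tx1 "$tmp/in.bin")

    # Decode: convert each hex value to a three-digit octal escape and let
    # printf emit the corresponding raw byte. $hex is deliberately unquoted
    # so that the shell splits it into one word per byte.
    for byte in $hex; do
        printf "\\$(printf '%03o' "0x$byte")"
    done > "$tmp/out.bin"

    cmp "$tmp/in.bin" "$tmp/out.bin" && echo "round trip OK"
    ```

    Where xxd is installed, xxd -p (encode) and xxd -r -p (decode) do the same job with less code.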
    

    BTW, I recommend using lower- or mixed-case variable names (there are a bunch of all-caps variables with special meanings, and using one of those by accident can have weird effects), and also using $( ) instead of backticks (easier to read, and avoids some obscure syntactic oddities).
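
For reference, the effects described at the top of this answer can be reproduced locally without curl. This is a minimal sketch, assuming bash; the sample bytes and the file names under $tmp are invented for the demonstration:

```bash
#!/usr/bin/env bash
tmp=$(mktemp -d)

# Sample data: 'a', NUL, 'b', newline, 'c', then two trailing newlines (7 bytes).
printf 'a\000b\nc\n\n' > "$tmp/orig.bin"

# Command substitution drops the NUL byte and strips the trailing newlines.
# The braces are there so that 2>/dev/null can hide the warning that newer
# bash versions print about the ignored NUL byte.
{ data=$(cat "$tmp/orig.bin"); } 2>/dev/null

# Like echo, this appends exactly one newline to whatever survived.
printf '%s\n' "$data" > "$tmp/roundtrip.bin"

# $(( )) normalizes the padding some wc implementations add to their output.
orig_size=$(( $(wc -c < "$tmp/orig.bin") ))
new_size=$(( $(wc -c < "$tmp/roundtrip.bin") ))
echo "original: $orig_size bytes, after round trip: $new_size bytes"

cmp -s "$tmp/orig.bin" "$tmp/roundtrip.bin" || echo "files differ"
```

The round-tripped file comes out shorter than the original, which is the same kind of mismatch that diff reports between out.1 and out.2 in the question.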