Tags: c, unix, binary, printf, hexdump

Print binary representation of file


I need a function that prints out the binary representation of a file, like the xxd program in Unix, but I want to write my own. Hexadecimal works just fine with %x, but there is no built-in format for binary. Does anyone know how to do this?


Solution

  • I usually do not believe in answering these sorts of questions with full code implementations. However, I was handed this bit of code many years ago and I feel obligated to pass it on. I have removed all the comments except for the usage, so you can try to figure out how it works yourself.

    Code base 16

    #include <stdio.h>
    #include <ctype.h>
    
    // Takes a pointer to an arbitrary chunk of data and prints the first len bytes.
    void dump (void* data, unsigned int len)
    {
      printf ("Size:  %u\n", len);
    
      if (len > 0) {
        unsigned width = 16;
        char *str = (char *)data;
        unsigned int j, i = 0;
    
        while (i < len) {
          printf (" ");
    
          for (j = 0; j < width; j++) {
            if (i + j < len)
              printf ("%02x ", (unsigned char) str [j]);
            else
              printf ("   ");
    
            if ((j + 1) % (width / 2) == 0)
              printf (" -  ");
          }
    
          for (j = 0; j < width; j++) {
            if (i + j < len)
              printf ("%c", isprint ((unsigned char) str [j]) ? str [j] : '.');
            else
              printf (" ");
          }
    
          str += width;
          i += j;
    
          printf ("\n");
        }
      }
    }
    


    Output base 16 (Excerpt from first 512 bytes* of a flash video)

    Size:  512
     00 00 00 20 66 74 79 70  -  69 73 6f 6d 00 00 02 00  -  ... ftypisom....
     69 73 6f 6d 69 73 6f 32  -  61 76 63 31 6d 70 34 31  -  isomiso2avc1mp41
     00 06 e8 e6 6d 6f 6f 76  -  00 00 00 6c 6d 76 68 64  -  ....moov...lmvhd
     00 00 00 00 7c 25 b0 80  -  7c 25 b0 80 00 00 03 e8  -  ....|%..|%......
     00 0c d6 2a 00 01 00 00  -  01 00 00 00 00 00 00 00  -  ...*............
     00 00 00 00 00 01 00 00  -  00 00 00 00 00 00 00 00  -  ................
     00 00 00 00 00 01 00 00  -  00 00 00 00 00 00 00 00  -  ................
     00 00 00 00 40 00 00 00  -  00 00 00 00 00 00 00 00  -  ....@...........
     00 00 00 00 00 00 00 00  -  00 00 00 00 00 00 00 00  -  ................
     00 01 00 02 00 01 9f 38  -  74 72 61 6b 00 00 00 5c  -  .......8trak...\
    


    I assume you already know how to tell the size of a file and read a file in binary mode, so I will leave that out of the discussion. Depending on your terminal width, you may need to adjust the variable width; the code is currently designed for 80-character terminals.
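
    For readers who do want the file-reading part spelled out, here is a minimal sketch. The helper name read_file is hypothetical and not part of the code I was handed. It assumes a regular file that fits in memory, whose size fits in an unsigned int, and a platform where seeking to the end of a binary stream reports the file size (true on the usual Unix systems).

    ```c
    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical helper (an assumption, not from the original code):
       reads a whole file in binary mode into a malloc'd buffer.
       Returns NULL on failure; on success stores the byte count in *len.
       The caller is responsible for calling free() on the result. */
    unsigned char *read_file (const char *path, unsigned int *len)
    {
      FILE *fp = fopen (path, "rb");   /* "rb": no newline translation */
      if (fp == NULL)
        return NULL;

      unsigned char *buf = NULL;
      long size;

      /* Seek to the end to learn the size, then rewind to the start. */
      if (fseek (fp, 0, SEEK_END) == 0 && (size = ftell (fp)) >= 0 &&
          fseek (fp, 0, SEEK_SET) == 0) {
        /* Allocate at least one byte so an empty file is not an error. */
        buf = malloc (size > 0 ? (size_t) size : 1);
        if (buf != NULL && fread (buf, 1, (size_t) size, fp) != (size_t) size) {
          free (buf);                  /* short read: treat as failure */
          buf = NULL;
        }
        if (buf != NULL)
          *len = (unsigned int) size;
      }

      fclose (fp);
      return buf;
    }
    ```

    The returned buffer and length can be handed straight to dump (clamping the length first if you only want the first 512 bytes) and the buffer freed afterwards.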

    I am also assuming that when you mentioned xxd in conjunction with "binary" you meant non-text as opposed to base 2. If you want base 2, set width to 6 and replace printf ("%02x ", (unsigned char) str [j]); with this:

    {
      for (int k = 7; k >= 0; k--)
        printf ("%d", ((unsigned char)str [j] >> k) & 1);
      printf (" ");
    }
    

    The required change is pretty simple: shift each of the octet's 8 bits into the least-significant position in turn and mask off everything else. The loop counts down from bit 7 to bit 0, which seems counter-intuitive at first, but we print left-to-right, so the most-significant bit has to come out first.
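
    The same shift-and-mask step can be pulled out into a standalone helper, which makes it easier to test on its own. This is only a sketch: the name byte_to_bits is hypothetical, and like the rest of this answer it assumes 8-bit bytes.

    ```c
    /* Hypothetical helper (an assumption, not from the original code):
       writes the low 8 bits of b into out as '0'/'1' characters,
       most-significant bit first, followed by a terminating NUL.
       out must have room for at least 9 characters. */
    void byte_to_bits (unsigned char b, char out [9])
    {
      for (int k = 7; k >= 0; k--)
        out [7 - k] = ((b >> k) & 1) ? '1' : '0';  /* bit k lands at index 7-k */
      out [8] = '\0';
    }
    ```

    For example, the byte 0x20 (a space) comes out as 00100000 and 0x66 ('f') as 01100110.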

    Output base 2 (Excerpt from first 512 bytes* of a flash video)

    Size:  512
     00000000 00000000 00000000  -  00100000 01100110 01110100  -  ... ft
     01111001 01110000 01101001  -  01110011 01101111 01101101  -  ypisom
     00000000 00000000 00000010  -  00000000 01101001 01110011  -  ....is
     01101111 01101101 01101001  -  01110011 01101111 00110010  -  omiso2
     01100001 01110110 01100011  -  00110001 01101101 01110000  -  avc1mp
     00110100 00110001 00000000  -  00000110 11101000 11100110  -  41....
     01101101 01101111 01101111  -  01110110 00000000 00000000  -  moov..
     00000000 01101100 01101101  -  01110110 01101000 01100100  -  .lmvhd
     00000000 00000000 00000000  -  00000000 01111100 00100101  -  ....|%
     10110000 10000000 01111100  -  00100101 10110000 10000000  -  ..|%..
     00000000 00000000 00000011  -  11101000 00000000 00001100  -  ......
    

    *For the sake of simplicity, let us pretend that a byte is always 8 bits.