
Pythonic way to hex dump files


Is there a Pythonic way to write this Bash command?

hexdump -e '2/1 "%02x"' file.dat

Obviously, without using os.popen, or any such shortcut ;)

It would be great if the code worked in Python 3.x.


Solution

  • If you only care about Python 2.x, chunk.encode('hex') will encode a chunk of binary data into hex. So:

    with open('file.dat', 'rb') as f:
        for chunk in iter(lambda: f.read(32), b''):
            print chunk.encode('hex')
    

    (IIRC, hexdump by default prints 32 pairs of hex per line; if not, just change that 32 to 16 or whatever it is…)

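    If the two-argument iter looks baffling, see the documentation for iter; it's not too complicated once you get the idea: iter(callable, sentinel) calls the callable repeatedly and stops as soon as it returns the sentinel.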
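As a rough illustration of the two-argument iter described above, here is a minimal sketch that reads from an in-memory buffer instead of a real file. The sample bytes and the chunk size of 4 are made up for the demonstration; they are not from the original answer.

```python
import io

# In-memory stand-in for a binary file (hypothetical sample data).
f = io.BytesIO(b"abcdefghij")

# iter(callable, sentinel) calls f.read(4) over and over and stops
# as soon as a call returns the sentinel b'' (end of data).
chunks = list(iter(lambda: f.read(4), b""))

print(chunks)  # [b'abcd', b'efgh', b'ij']
```

The last chunk is shorter than the others because read returns whatever is left, and the empty bytes object that follows ends the loop without being yielded.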

    If you care about Python 3.x, the encode method only works for codecs that convert Unicode strings to bytes; for codecs that convert the other way around (or any other combination, such as bytes to bytes), you have to call codecs.encode explicitly:

    import codecs

    with open('file.dat', 'rb') as f:
        for chunk in iter(lambda: f.read(32), b''):
            # codecs.encode returns bytes; decode to print plain hex text
            print(codecs.encode(chunk, 'hex').decode('ascii'))
    

    Or it may be better to use hexlify:

    import binascii

    with open('file.dat', 'rb') as f:
        for chunk in iter(lambda: f.read(32), b''):
            # hexlify returns bytes; decode to print plain hex text
            print(binascii.hexlify(chunk).decode('ascii'))
    

    If you want to do something besides print them out, rather than read the whole file into memory, you probably want to make an iterator. You could just put this in a function and change that print to a yield, and that function returns exactly the iterator you want. Or use a genexpr or map call:

    import binascii

    with open('file.dat', 'rb') as f:
        chunks = iter(lambda: f.read(32), b'')
        # In Python 3, map is lazy: consume hexlines before the with block ends
        hexlines = map(binascii.hexlify, chunks)
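The "put this in a function and change that print to a yield" option could look roughly like the sketch below. The function name hex_chunks, its chunk_size parameter, and the temporary demo file are hypothetical names chosen for illustration, not part of the original answer.

```python
import binascii
import os
import tempfile


def hex_chunks(path, chunk_size=32):
    """Yield the hex text of each chunk_size-byte chunk of the file at path."""
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            # hexlify returns bytes; decode so callers get ordinary strings
            yield binascii.hexlify(chunk).decode('ascii')


# Demo on a small temporary file holding the bytes 00 01 02 03 04.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(5)))

demo = list(hex_chunks(tmp.name, chunk_size=2))
os.remove(tmp.name)

print(demo)  # ['0001', '0203', '04']
```

Because the file is opened inside the generator, it stays open only while the caller is iterating, which avoids the closed-file problem a lazy map can run into once the with block has exited.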