php, python, hex, pack, nibble

PHP to python pack('H')


I'm translating an authentication library from PHP to Python. It's all legacy code, and the original devs are long gone. They used PHP's 'pack' function to transform a string into hex using the 'H' format code. PHP's documentation describes this code as 'Hex string, high nibble first'. I read another question (Python equivalent of php pack) which suggested using binascii.unhexlify(), but this raises an error whenever I pass in a non-hex character.

So my question is: what does the PHP pack function do with non-hex characters? Does it throw them away, or is there an extra step that performs a translation? Is there a better method in Python than binascii.unhexlify()?

So packing with 'H*':

php -r 'print pack("H*", md5("Dummy String"));' 

Returns

??????=?PW??

In Python:

secret = binascii.unhexlify( "Dummy String" )
TypeError: Non-hexadecimal digit found

Thanks for the help.

[EDIT]

So DJV was fundamentally right: I needed to take the MD5 hash of the value first. That's where it gets interesting, though. In Python, the md5 library returns the binary data directly via the 'digest' method.

In my case I could skip all the binascii calls and just use:

md5.md5('Dummy String').digest()

This is equivalent to the following in PHP:

pack("H*", md5("Dummy String"));

Fun stuff. Good to know.
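The equivalence described in the edit above can be checked directly. This is a minimal sketch assuming Python 3, where `hashlib` takes over from the old `md5` module and hash input must be bytes. The variable names are illustrative only.

```python
import binascii
import hashlib

message = b"Dummy String"  # hashlib requires bytes on Python 3

# hexdigest() gives 32 hex characters, comparable to what PHP's md5() returns.
hex_digest = hashlib.md5(message).hexdigest()

# digest() gives the 16 raw bytes directly.
raw_digest = hashlib.md5(message).digest()

# Decoding the hex string yields the same raw bytes, which is the
# step that pack("H*", ...) performs on the PHP side.
packed = binascii.unhexlify(hex_digest)

print(packed == raw_digest)                     # True
print(bytes.fromhex(hex_digest) == raw_digest)  # True
```

If only the raw bytes are needed, calling `digest()` alone avoids the round trip through hex.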


Solution

  • I think you need it the other way around. "Dummy String" is not a valid hex string. You can hexlify it:

    >>> binascii.hexlify('Dummy String')
    '44756d6d7920537472696e67'
    

    but you cannot unhexlify it. unhexlify takes a string of hex digits and turns it back into the bytes those digits represent (here, ASCII text):

    >>> binascii.unhexlify('44756d6d7920537472696e67')
    'Dummy String'
    

    What you need is to take the MD5 hash of the string ("Dummy String" in our case) and unhexlify its hex digest:

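    import binascii
    import hashlib

    # hexdigest() returns the hash as 32 hex characters, like PHP's md5()
    the_hash = hashlib.md5(b'Dummy String').hexdigest()
    print(the_hash)
    # unhexlify() converts those hex characters into the 16 raw bytes
    # (Python 3 prints them as a b'...' literal rather than raw characters)
    the_unhex = binascii.unhexlify(the_hash)
    print(the_unhex)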
    

    Which yields the hash, and the unhexlified hash:

    ec041da9f891c09b3d1617ba5057b3f5
    ЛLЬ-ю?=¦PWЁУ
    

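    Note: although the output doesn't look exactly like yours ("??????=?PW??"), the "=" and "PW" appear in both, which makes me pretty certain it's correct. The other bytes are not printable ASCII, so how they are displayed depends on the terminal.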

    More on hashlib and binascii
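
The hexlify/unhexlify round trip in this answer, and the error from the question, can be reproduced together. This is a minimal sketch assuming Python 3, where the error for non-hex input is `binascii.Error` (a `ValueError` subclass) rather than the `TypeError` shown in the question. The helper `unhex_or_none` is a hypothetical name for illustration, not part of `binascii`.

```python
import binascii


def unhex_or_none(text):
    """Hypothetical helper: decode a hex string, or return None if it is not valid hex."""
    try:
        return binascii.unhexlify(text)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None


# Bytes -> hex digits -> bytes round trip.
as_hex = binascii.hexlify(b"Dummy String")
print(as_hex)                 # b'44756d6d7920537472696e67'
print(unhex_or_none(as_hex))  # b'Dummy String'

# "Dummy String" itself contains non-hex characters, so decoding fails.
print(unhex_or_none("Dummy String"))  # None
```

Returning None is only one way to handle the failure. Code that should reject bad input loudly can let the exception propagate instead.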