Search code examples
pythonpython-2.7binaryhex

Two's complement of Hex number in Python


Below a and b (hex), representing two's complement signed binary numbers. For example:

a = 0x17c7cc6e
b = 0xc158a854

Now I want to know the signed representation of a & b in base 10. Sorry I'm a low level programmer and new to python; feel very stupid for asking this. I don't care about additional library's but the answer should be simple and straight forward. Background: a & b are extracted data from a UDP packet. I have no control over the format. So please don't give me an answer that would assume I can change the format of those varibles before hand.

I have converted a & b into the following with this:

aBinary = bin(int(a, 16))[2:].zfill(32) => 00010111110001111100110001101110 => 398969966
bBinary = bin(int(b, 16))[2:].zfill(32) => 11000001010110001010100001010100 => -1051154348

I was trying to do something like this (doesn't work):

if aBinary[1:2] == 1:
aBinary = ~aBinary + int(1, 2)

What is the proper way to do this in python?


Solution

  • You'll have to know at least the width of your data. For instance, 0xc158a854 has 8 hexadecimal digits so it must be at least 32 bits wide; it appears to be an unsigned 32 bit value. We can process it using some bitwise operations:

    In [232]: b = 0xc158a854
    
    In [233]: if b >= 1<<31: b -= 1<<32
    
    In [234]: b
    Out[234]: -1051154348L
    

    The L here marks that Python 2 has switched to processing the value as a long; it's usually not important, but in this case indicates that I've been working with values outside the common int range for this installation. The tool to extract data from binary structures such as UDP packets is struct.unpack; if you just tell it that your value is signed in the first place, it will produce the correct value:

    In [240]: s = '\xc1\x58\xa8\x54'
    
    In [241]: import struct
    
    In [242]: struct.unpack('>i', s)
    Out[242]: (-1051154348,)
    

    That assumes two's complement representation; one's complement (such as the checksum used in UDP), sign and magnitude, or IEEE 754 floating point are some less common encodings for numbers.