python endianness sign

Read 32-bit signed value from an "unsigned" bytestream


I want to extract data from a file whose information is stored in big-endian byte order and always as unsigned values. How does the "cast" from unsigned int to int affect the actual decimal value? Am I correct that the leftmost bit decides whether the value is positive or negative?

I want to parse that file format with Python, and reading an unsigned value is easy:

def toU32(bits):
    return ord(bits[0]) << 24 | ord(bits[1]) << 16 | ord(bits[2]) << 8  | ord(bits[3])

But what would the corresponding toS32 function look like?


Thanks for the info about the struct module. But I am still interested in a solution to my actual question.


Solution

  • I would use struct.

    import struct
    
    def toU32(bits):
        return struct.unpack_from(">I", bits)[0]
    
    def toS32(bits):
        return struct.unpack_from(">i", bits)[0]
    

    The format string ">I" means: read a big-endian (">") unsigned 32-bit integer ("I") from the string bits. For signed integers, use ">i".
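
To make the difference between the two format strings concrete, here is a small sketch that unpacks the same four bytes both ways. The sample bytes and the variable names are chosen only for illustration and do not come from the question's file format.

```python
import struct

# 0xFFFFFFFE: every bit set except the lowest one (illustrative sample data)
raw = b"\xff\xff\xff\xfe"

as_unsigned = struct.unpack(">I", raw)[0]  # big-endian, unsigned 32-bit
as_signed = struct.unpack(">i", raw)[0]    # big-endian, signed 32-bit

print(as_unsigned)
print(as_signed)
```

The bytes are identical in both calls; only the interpretation of the top bit changes, so the two results differ by exactly 2**32.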

    EDIT

    I had to look at another StackOverflow answer to remember how to "convert" an unsigned integer to a signed integer in Python. It is less a conversion and more a reinterpretation of the bits.

    import struct
    
    def toU32(bits):
        return ord(bits[0]) << 24 | ord(bits[1]) << 16 | ord(bits[2]) << 8 | ord(bits[3])
    
    def toS32(bits):
        candidate = toU32(bits)
        if candidate >> 31:  # is the sign bit set?
            return -0x80000000 + (candidate & 0x7fffffff)  # "cast" it to signed
        return candidate
        return candidate
    
    
    for x in range(-5,5):
        bits = struct.pack(">i", x)
        print toU32(bits)
        print toS32(bits)
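
The listing above is written for Python 2 (ord() on a str, the print statement). As a rough sketch of the same reinterpretation in Python 3, assuming the input is a bytes object of at least four bytes, something like the following should work. The names to_u32 and to_s32 are hypothetical, chosen only to mirror the functions above. The idea is that when bit 31 is set, the two's-complement value equals the unsigned value minus 2**32.

```python
import struct

def to_u32(data):
    # Indexing a bytes object yields ints in Python 3, so no ord() is needed.
    return data[0] << 24 | data[1] << 16 | data[2] << 8 | data[3]

def to_s32(data):
    value = to_u32(data)
    # If the top bit (bit 31) is set, reinterpret as negative by subtracting 2**32.
    return value - (1 << 32) if value & 0x80000000 else value

# Cross-check against struct and the built-in int.from_bytes.
for x in (-2**31, -5, -1, 0, 1, 2**31 - 1):
    packed = struct.pack(">i", x)
    assert to_s32(packed) == x
    assert to_s32(packed) == int.from_bytes(packed[:4], "big", signed=True)
```

So the leftmost bit does decide the sign: with it clear, the signed and unsigned readings agree; with it set, the signed reading is the unsigned one shifted down by 2**32.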