
Parsing out bit offsets from a hex number in Python


I have a 64-bit hex number coming into my script, 0x0000040800000000. I want to extract bits 39:32 from it.

How can I do this? I have been parsing individual parts of a string and have ended up in a mess.

I was initially converting it to a binary string and slicing sections out of that:

command_register =  "".join(["{0:04b}".format(int(c,16)) for c in str(command_register)])

Solution

  • First convert your hex string into an integer, then use ordinary integer operations to extract the bits.

    Bits are usually numbered from the least significant bit, i.e. the rightmost bit when displayed in binary is bit 0. So to extract bits 39:32 (8 consecutive bits), you need a mask of 0xFF00000000. AND your number with the mask, then shift the result 32 bits to the right.

    Using your hex value and extracting bits 39 to 32 would give you a value of 0x08. The following script shows you how:

    hex_string = "0x0000040800000000"
    number = int(hex_string, 16)    # Convert to an integer
    mask_39_to_32 = 0xFF00000000    # Suitable mask to extract the bits with
    
    print(f"As hex: 0x{number:X}")
    print()
    print("Bits 39-32:                         xxxxxxxx")
    print(f" As binary: {bin(number)[2:]:0>64s}")
    print(f"      Mask: {bin(mask_39_to_32)[2:]:0>64s}")
    print(f"AND result: {bin(number & mask_39_to_32)[2:]:0>64s}")
    print(f"   Shifted: {bin((number & mask_39_to_32) >> 32)[2:]:0>64s}")
    print(f" As an int: {(number & mask_39_to_32) >> 32}")
    

    Which displays the following output:

    As hex: 0x40800000000
    
    Bits 39-32:                         xxxxxxxx
     As binary: 0000000000000000000001000000100000000000000000000000000000000000
          Mask: 0000000000000000000000001111111100000000000000000000000000000000
    AND result: 0000000000000000000000000000100000000000000000000000000000000000
       Shifted: 0000000000000000000000000000000000000000000000000000000000001000
     As an int: 8
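
An equivalent approach, sketched here as an alternative rather than as part of the original script, is to shift first and then AND with a mask that sits at bit 0. The mask is then just 0xFF regardless of where the field lives in the number. The variable name `field` is only illustrative.

```python
number = int("0x0000040800000000", 16)

# Move bits 39:32 down to bits 7:0, then keep only the low 8 bits.
field = (number >> 32) & 0xFF

print(f"Bits 39-32: 0x{field:02X}")
```

Both orderings give the same value; shifting first keeps the mask constant small and easy to read.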
    

    The mask needed for 47 to 40 would be:

    Bits 47-40:                 xxxxxxxx
     As binary: 0000000000000000111111110000000000000000000000000000000000000000
        As hex: 0xFF0000000000
    

    Hexadecimal makes masks less verbose, and clearer once you get used to it. Each hex digit covers 4 bits, so a mask for 8 consecutive bits that starts on a hex-digit (4-bit) boundary always appears as 'FF' followed by zeros.
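
Rather than writing each mask by hand, the mask can also be computed from the bit positions. The following is a minimal sketch; the helper names `make_mask` and `extract_bits` are hypothetical and not part of any library, and bit 0 is assumed to be the least significant bit.

```python
def make_mask(msb: int, lsb: int) -> int:
    """Return an integer with bits msb..lsb (inclusive) set to 1."""
    if lsb < 0 or msb < lsb:
        raise ValueError("expected msb >= lsb >= 0")
    width = msb - lsb + 1
    # (1 << width) - 1 gives `width` ones at bit 0; shift them up to lsb.
    return ((1 << width) - 1) << lsb


def extract_bits(value: int, msb: int, lsb: int) -> int:
    """Return the field value[msb:lsb], shifted down to bit 0."""
    return (value & make_mask(msb, lsb)) >> lsb


number = int("0x0000040800000000", 16)

print(f"Mask 39-32: 0x{make_mask(39, 32):X}")
print(f"Mask 47-40: 0x{make_mask(47, 40):X}")
print(f"Bits 39-32: {extract_bits(number, 39, 32)}")
print(f"Bits 47-40: {extract_bits(number, 47, 40)}")
```

This reproduces the two hand-written masks above and also works for fields that are not byte-aligned, where the hex form is less obvious.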

    The Wikipedia article on bitwise operations should help you to understand the process.