
Read a binary file and check for a matching byte in Python


I would like to scan through data files from a GPS receiver byte by byte (it will eventually be a continuous stream, but for now I want to test the code with offline data). If I find a match, I need to check the next 2 bytes for the 'length', then take the next 2 bytes and shift them 2 bits (not bytes) to the right, and so on. I haven't handled binary data before, so I am stuck on a simple task. I can read the binary file byte by byte, but I cannot find a way to match my desired pattern (i.e. D3).

with open("COM6_200417.ubx", "rb") as f:
    byte = f.read(1)  # read 1 byte at a time
    while byte != b"":  # f.read(1) returns b"" at end of file
        # Do stuff with byte.
        print(byte)
        byte = f.read(1)

The output is:

b'\x82'
b'\xc2'
b'\xe3'
b'\xb8'
b'\xe0'
b'\x00'
b'@'
b'\x13'
b'\x05'
b'!'
b'\xd3'
b'\x00'
b'\x13'

....

How can I check whether a byte is equal to '\xd3' (D3)? I would also like to know how to shift bit-wise, as I need to check a decimal value consisting of 6 bits (1 byte and the next byte's first 2 bits). I am considering taking 2 bytes (16 bits) and then right-shifting by 2 bits to get the 6 bits. Is this possible in Python? Any improvements, additions, or changes are very much appreciated.

P.S. Can I get rid of that pesky 'b' at the front? If ignoring it does not affect anything, then it is no problem.

Thanks in advance.


Solution

  • 'That byte' is displayed with a b'' prefix, which indicates that it is a bytes object. To get rid of the prefix, you can convert it to an int:

    thatbyte = b'\xd3'
    byteint = thatbyte[0]  # indexing a bytes object gives an int (here 211), or
    byteint = int.from_bytes(thatbyte, 'big')  # 'big' or 'little' endian, which results in the same when converting a single byte
    

    To compare, you can do:

    thatbyte == b'\xd3'
    

    This compares one bytes object with another bytes object. The shift operators << and >> work on int only, so convert the bytes to an int before shifting.

    To convert an int back to bytes (assuming it is in the range 0..255), you can use:

    bytes([byteint])   # note the extra brackets!
    

    As for improvements, I would suggest reading the whole binary file at once:

    with open("COM6_200417.ubx", "rb") as f:
        allbytes = f.read() # read all
        for val in allbytes:
            # Do stuff with val, val is int !!!
            print(bytes([val]))
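
Putting the pieces together, here is a minimal sketch of matching the D3 byte and then shifting the following two bytes. The helper names `find_marker` and `two_bytes_as_int` are made up for illustration, the sample data is simply the bytes printed in the question, and the choice of shifting right by 2 and keeping 6 bits follows the question's description. Which bits actually carry the value depends on the receiver's message format, which is not specified here, so treat the shift and mask as assumptions to adjust.

```python
MARKER = 0xD3  # the byte value the question wants to match


def find_marker(data, marker=MARKER):
    """Return the index of every byte in data that equals marker."""
    # Iterating over a bytes object yields ints, so compare int with int.
    return [i for i, val in enumerate(data) if val == marker]


def two_bytes_as_int(data, start):
    """Combine data[start] and data[start + 1] into one big-endian int."""
    return int.from_bytes(data[start:start + 2], "big")


# The bytes shown in the question's output, joined into one bytes object.
sample = b'\x82\xc2\xe3\xb8\xe0\x00@\x13\x05!\xd3\x00\x13'

for pos in find_marker(sample):
    if pos + 2 >= len(sample):
        continue  # not enough bytes left after the marker
    word = two_bytes_as_int(sample, pos + 1)  # the 2 bytes after D3
    shifted = word >> 2      # shift right by 2 bits (assumed layout)
    low6 = shifted & 0x3F    # keep only the lowest 6 bits (assumed layout)
    print(pos, hex(word), shifted, low6)  # prints: 10 0x13 4 4
```

For a continuous stream, the same int comparison and shifting should apply to each chunk as it arrives. Only the way the bytes are obtained changes.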