Search code examples
endiannesspacketunsigned-integerpacked

How do I interpret a python byte string coming from F1 2020 game UDP packet?


Title may be wildly incorrect for what I'm trying to work out.

I'm trying to interpret packets I am recieving from a racing game in a way that I understand, but I honestly don't really know what I'm looking at, or what to search to understand it. Information on the packets I am recieving here: https://forums.codemasters.com/topic/54423-f1%C2%AE-2020-udp-specification/?tab=comments#comment-532560

I'm using python to print the packets, here's a snippet of the output, which I don't understand how to interpret.

received message: b'\xe4\x07\x01\x03\x01\x07O\x90.\xea\xc2!7\x16\xa5\xbb\x02C\xda\n\x00\x00\x00\xff\x01\x00\x03:\x00\x00\x00 A\x00\x00\xdcB\xb5+\xc1@\xc82\xcc\x10\t\x00\xd9\x00\x00\x00\x00\x00\x12\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00$tJ\x03\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01

I'm very new to coding, and not sure what my next step is, so a nudge in the right direction will help loads, thanks.

This is the python code:

import socket
UDP_IP = "127.0.0.1"
UDP_PORT = 20777
sock = socket.socket(socket.AF_INET, # Internet
              socket.SOCK_DGRAM) # UDP
sock.bind((UDP_IP, UDP_PORT))
while True:
    data, addr = sock.recvfrom(4096)
    print ("received message:", data)

Solution

  • The website you link to is describing the data format. All data represented as a series of 1's and 0's. A byte is a series of 8 1's and 0's. However, just because you have a series of bytes doesn't mean you know how to interpret them. Do they represent a character? An integer? Can that integer be negative? All of that is defined by whoever crafted the data in the first place.

    The type descriptions you see at the top are telling you how to actually interpret that series of 1's and 0's. When you see "unit8", that is an "unsigned integer that is 8 bits (1 byte) long". In other words, a positive number between 0 and 255. An "int8" on the other hand is an "8-bit integer", or a number that can be positive or negative (so the range is -128 to 127). The same basic idea applies to the *16 and *64 variants, just with 16 bits or 64 bits. A float represent a floating point number (a number with a fractional part, such as 1.2345), generally 4 bytes long. Additionally, you need to know the order to interpret the bytes within a word (left-to-right or right-to-left). This is referred to as the endianness, and every computer architecture has a native endianness (big-endian or little-endian).

    Given all of that, you can interpret the PacketHeader. The easiest way is probably to use the struct package in Python. Details can be found here: https://docs.python.org/3/library/struct.html

    As a proof of concept, the following will interpret the first 24 bytes:

    import struct
    data = b'\xe4\x07\x01\x03\x01\x07O\x90.\xea\xc2!7\x16\xa5\xbb\x02C\xda\n\x00\x00\x00\xff\x01\x00\x03:\x00\x00\x00 A\x00\x00\xdcB\xb5+\xc1@\xc82\xcc\x10\t\x00\xd9\x00\x00\x00\x00\x00\x12\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00$tJ\x03\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01'
    
    #Note that I am only taking the first 24 bytes. You must pass data that is
    #the appropriate length to the unpack function. We don't know what everything 
    #else is until after we parse out the header
    header = struct.unpack('<HBBBBQfIBB', data[:24])
    print(header)
    

    You basically want to read the first 24 bytes to get the header of the message. From there, you need to use the m_packetId field to determine what the rest of the message is. As an example, this particular packet has a packetId of 7, which is a "Car Status" packet. So you would look at the packing format for the struct CarStatus further down on that page to figure out how to interpret the rest of the message. Rinse and repeat as data arrives.

    Update: In the format string, the < tells you to interpret the bytes as little-endian with no alignment (based on the fact that the documentation says it is little-endian and packed). I would recommend reading through the entire section on Format Characters in the documentation above to fully understand what all is happening regarding alignment, but in a nutshell it will try to align those bytes with their representation in memory, which may not match exactly the format you specify. In this case, HBBBBQ takes up 2 bytes more than you'd expect. This is because your computer will try to pack structs in memory so that they are word-aligned. Your computer architecture determines the word alignment (on a 64-bit computer, words are 64-bits, or 8 bytes, long). A Q takes a full word, so the packer will try to align everything before the Q to a word. However, HBBBB only requires 6 bytes; so, Python will, by default, pad an extra 2 bytes to make sure everything lines up. Using < at the front both ensures that the bytes will be interpreted in the correct order, and that it won't try to align the bytes.