
In the Python construct library (for parsing binary data), how do I group the rest of the data into one field?


I am using the Python construct library to parse Bluetooth protocols.

As the protocols are really complex, I split the parsing into multiple stages instead of building one gigantic construct. So far I parse the raw data into this structure:

Container({'CRC': 'd\xcbT',
 'CRC_OK': 1,
 'Channel': 38,
 'RSSI': 43,
 'access_addr': 2391391958L,
 'header': Container({'TxAdd': False, 'PDU_length': 34, 'PDU_Type': 'ADV_IND', 'RxAdd': False}),
 'payload': '2\x15\x00a\x02\x00\x02\x01\x06\x07\x03\x03\x18\x02\x18\x04\x18\x03\x19\x00\x02\x02\n\xfe\t\tAS-D1532'})

As you can see, the length of the payload is given by PDU_length, which is 34. The payload has the following structure:

[first 6 octets: AdvertAddress][the rest of data of 0-31 octets: AdvertData]

However, when I parse the payload as a standalone structure, the length of 34 is no longer available in the context of the payload construct. How can I make a construct that parses the first 6 octets as AdvertAddress and groups the rest of the data as AdvertData?

My current solution looks like this:

from construct import Struct, Field  # construct 2.x API, as used in this question

length = len(payload)  # PDU_length would also work; len(payload) is 34 here as well
ADVERT_PAYLOAD = Struct("ADVERT_PAYLOAD",
    Field("AdvertAddress",6),
    Field("AdvertData",length-6),
)
print ADVERT_PAYLOAD.parse(payload)

This gives the correct output. But not all payloads are 34 octets long, so this method requires me to rebuild ADVERT_PAYLOAD every time I need to parse a new payload.

I read the documentation many times but couldn't find anything relevant. I see no way to pass the length of the payload into the context of ADVERT_PAYLOAD, nor a way for the construct to get the length of the argument passed to the parse method.

Maybe there is no solution to this problem. But then, how do most people parse such protocol data? As you go further into the payload, it subdivides into more types, and you need more and smaller constructs to parse them. Should I build a parent construct that embeds smaller constructs, which in turn embed even smaller constructs? I can't imagine how to go about building such a big thing.

Thanks in advance.


Solution

  • GreedyRange consumes the rest of the data as a list of single characters, and a JoinAdapter joins those characters back into one string:

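    from construct import Adapter, Field, GreedyRange, Struct

    class JoinAdapter(Adapter):
        # Only parsing is needed here, so only _decode is defined.
        def _decode(self, obj, context):
            return "".join(obj)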
    
    ADVERT_PAYLOAD = Struct("ADVERT_PAYLOAD",
        Field("AdvertAddress",6),
        JoinAdapter(GreedyRange(Field("AdvertData", 1)))
    )
    
    payload = '2\x15\x00a\x02\x00\x02\x01\x06\x07\x03\x03\x18\x02\x18\x04\x18\x03\x19\x00\x02\x02\n\xfe\t\tAS-D1532'
    print ADVERT_PAYLOAD.parse(payload)
    

    Output:

    Container:
        AdvertAddress = '2\x15\x00a\x02\x00'
        AdvertData = '\x02\x01\x06\x07\x03\x03\x18\x02\x18\x04\x18\x03\x19\x00\x02\x02\n\xfe\t\tAS-D1532'
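
For comparison, here is a minimal sketch of the same split done with plain slicing, without construct. It assumes only the layout given in the question (6 octets of AdvertAddress followed by the remainder as AdvertData); the helper name split_advert_payload and the constant ADDRESS_LEN are made up for illustration and are not part of the construct library.

```python
ADDRESS_LEN = 6  # per the question, AdvertAddress is the first 6 octets


def split_advert_payload(payload):
    """Split a raw advertising payload into (AdvertAddress, AdvertData).

    Hypothetical helper, not part of construct. AdvertData is simply
    whatever follows the address, so no length has to be passed in.
    """
    if len(payload) < ADDRESS_LEN:
        raise ValueError("payload shorter than %d octets" % ADDRESS_LEN)
    return payload[:ADDRESS_LEN], payload[ADDRESS_LEN:]


# The 34-octet payload from the question, written as a bytes literal.
payload = (b'2\x15\x00a\x02\x00\x02\x01\x06\x07\x03\x03\x18\x02\x18\x04'
           b'\x18\x03\x19\x00\x02\x02\n\xfe\t\tAS-D1532')

address, data = split_advert_payload(payload)
print(repr(address))
print(repr(data))
```

Because the slice takes "everything after octet 6", the same function works for payloads of any length, which is the same idea GreedyRange expresses inside a construct. Newer construct releases also provide a GreedyBytes construct that reads the rest of the stream as one field, which may make the adapter unnecessary there; check the documentation for the version you have installed.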