
Create a dataframe from a dictionary in Python when a value is a bytes object?


I want to create a dataframe from specific data inside a dictionary. The value I need is stored as bytes, and I don't understand how to get the information out of the bytes in a useful way. Once the data is in a dataframe, I know how to handle it (sort, plot, etc.).

I have this dictionary:

{'SequenceNumber': 2654504175, 'Offset': '67826126730624', 'EnqueuedTimeUtc': '7/10/2020 1:18:00 PM', 'SystemProperties': {}, 'Properties': {}, 'Body': b'{"id": "MicroSCADA OPC DA.S_M.APL.1.P.P_R_P.1", "ts": "2020-07-10T13:17:24.654000", "value": 1.1293551921844482, "status_code": 0}'}

It is the result of reading one data point from an Avro file. The data I need is inside 'Body'.

I go:

x = my_dict.get("Body")

the result is:

b'{"id": "MicroSCADA OPC DA.S_M.APL.1.P.P_R_P.1", "ts": "2020-07-10T13:17:24.654000", "value": 1.1293551921844482, "status_code": 0}'

I would like to put the data into a dataframe with the columns "id", "ts", "value", and "status_code". How can I do this?

I have also tried pandavro, but the bytes object still "locks" the data I need together. I have tried converting the bytes to a string, but then each key and its value no longer naturally belong together.

What is the best way to solve this?


Solution

  • Decode the byte string to a regular string, evaluate it to get a dictionary, and build a DataFrame from that:

    import pandas as pd
    
    dic = {'SequenceNumber': 2654504175, 'Offset': '67826126730624', 'EnqueuedTimeUtc': '7/10/2020 1:18:00 PM', 'SystemProperties': {}, 'Properties': {}, 'Body': b'{"id": "MicroSCADA OPC DA.S_M.APL.1.P.P_R_P.1", "ts": "2020-07-10T13:17:24.654000", "value": 1.1293551921844482, "status_code": 0}'}
    
    # Decode bytes to str, then evaluate it (eval runs arbitrary code: trusted input only).
    d = eval(dic["Body"].decode("utf-8"))
    
    # Build a single row from the dict's values, using its keys as column names.
    df = pd.DataFrame([list(d.values())], columns=list(d.keys()))
    
    df
    

    Output:

                                          id                          ts     value  status_code
    0  MicroSCADA OPC DA.S_M.APL.1.P.P_R_P.1  2020-07-10T13:17:24.654000  1.129355            0
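
Since 'Body' holds JSON text, a possible alternative is to parse it with the standard library's json.loads instead of eval. This avoids executing the payload as code and also handles JSON literals such as true, false and null, which eval would reject. The sketch below is only an illustration under assumptions: the names record and bodies_to_frame are hypothetical, it assumes every record has a 'Body' containing valid UTF-8 JSON, and the conversion of "ts" to a datetime is optional.

```python
import json

import pandas as pd

# Hypothetical single record, copied from the question.
record = {
    "SequenceNumber": 2654504175,
    "Offset": "67826126730624",
    "EnqueuedTimeUtc": "7/10/2020 1:18:00 PM",
    "SystemProperties": {},
    "Properties": {},
    "Body": b'{"id": "MicroSCADA OPC DA.S_M.APL.1.P.P_R_P.1", "ts": "2020-07-10T13:17:24.654000", "value": 1.1293551921844482, "status_code": 0}',
}


def bodies_to_frame(records):
    """Parse the JSON in each record's 'Body' and return one row per record."""
    # json.loads accepts bytes directly, so no explicit decode is needed.
    rows = [json.loads(r["Body"]) for r in records]
    # A list of dicts becomes one row per dict, with the keys as columns.
    return pd.DataFrame(rows)


df = bodies_to_frame([record])

# Optional: turn the ISO 8601 strings into real timestamps for sorting and plotting.
df["ts"] = pd.to_datetime(df["ts"])

print(df)
```

Because the helper takes a list, the same call should work when many data points are read from the Avro file: pass all of the records and each one becomes a row.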