Search code examples
pythonjsonpandasjsonb

JSONB Type -split in multiple columns


I have a jsonb column that came from database and i want to split this Column and create separate columns for the Values,Field,Timestamp. My current dataframe is :

Id Fields"
1009331137 {"101011182899": {"Value": "12.3", "Timestamp": "2021-01-08"}, "101011402899": {"Value": "34.2", "Timestamp": "2021-01-08"}}

This is the desired output :

Id Field Value timestamp
1009331137 101011182899 12.3 2021-01-08
1009331137 101011402899 34.2 2021-01-08

Solution

  • We can slightly change the structure of the dictionary to get the output we want. So let's convert nested dictionary to flattened dictionary.

    df["Fields"] = df["Fields"].apply(lambda x: [{"Field":k,**v} for k,v in x.items()])
    

    New format:

        id          Fields
    0   1009331137  [{'Field': '101011182899', 'Value': '12.3', 'Timestamp': '2021-01-08'}, {'Field': '101011402899', 'Value': '34.2', 'Timestamp': '2021-01-08'}]
    

    Now we can use explode() and json_normalize()

    df = df.explode("Fields").reset_index(drop=True)
    df = df.join(pd.json_normalize(df.pop("Fields")))
    

    Out:

               id         Field Value   Timestamp
    0  1009331137  101011182899  12.3  2021-01-08
    1  1009331137  101011402899  34.2  2021-01-08