Search code examples
pythonjsonpandasdictionaryconvertto-json

Convert dataframe columns into dict and save it as json file, with columns name as dict key and row values as dict values


I have a pandas dataframe as below :

device_id   user_id level
d_id1   u_id1   1
d_id1   u_id2   -1
d_id1   u_id3   1
d_id2   u_id100 -1
d_id3   u_id100 1
d_id3   u_id4   1

I want to convert it into a list of dictionaries depend on columns and rows that looks like:

[
{
"device_id":d_id1
"result_key":[
                {"user_id":u_id1,"level":1},
                {"user_id":u_id2,"level":-1},
                {"user_id":u_id3,"level":1}
             ]
},
{
"device_id":d_id2
"result_key":[{"user_id":u_id100,"level":-1}]
},
{
"device_id":d_id3
"result_key":[{"user_id":u_id100,"level":1},
              {"user_id":u_id4,"level":1}
              ]
}
]

What is the effecient way to do that and save it into json file?


Solution

  • Use GroupBy.apply with DataFrame.to_dict for result_key column first, then for dictionary use to_dict with both columns:

    d = (df.groupby('device_id')[['user_id','level']].apply(lambda x: x.to_dict('records'))
           .reset_index(name='result_key')
           .to_dict('records'))
    print (d)
    [{'device_id': 'd_id1', 'result_key': [{'user_id': 'u_id1', 'level': 1}, 
                                           {'user_id': 'u_id2', 'level': -1}, 
                                           {'user_id': 'u_id3', 'level': 1}]}, 
     {'device_id': 'd_id2', 'result_key': [{'user_id': 'u_id100', 'level': -1}]},
     {'device_id': 'd_id3', 'result_key': [{'user_id': 'u_id100', 'level': 1}, 
                                           {'user_id': 'u_id4', 'level': 1}]}]
    

    Last convert dictionary to json file:

    import json
    with open('out.json', 'w') as f:
        json.dump(d, f)