pandas groupby and convert to json of defined schema


I have the following pandas DataFrame:

id  mobile
1   9998887776
2   8887776665
1   7776665554
2   6665554443
3   5554443332

I want to group by id and get the following result:

id   mobile
1    [{"9998887776": {"status": "verified"}},{"7776665554": {"status": "verified"}}]
2    [{"8887776665": {"status": "verified"}},{"6665554443": {"status": "verified"}}]
3    [{"5554443332": {"status": "verified"}}]

I know the to_json method won't help here and that I have to write a UDF, but I am new to this and a bit stuck.


Solution

  • Use a list comprehension inside GroupBy.apply to build each group's list of dictionaries in the required format:

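    # Build one {number: {"status": "verified"}} dict per number in each group
    f = lambda x: [{y: {"status": "verified"}} for y in x]
    # Assign to a new name so the original df is not overwritten
    out = df.groupby('id')['mobile'].apply(f).reset_index()
    print(out)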
       id                                             mobile
    0   1  [{9998887776: {'status': 'verified'}}, {777666...
    1   2  [{8887776665: {'status': 'verified'}}, {666555...
    2   3             [{5554443332: {'status': 'verified'}}]
    

    If you need each group as a JSON string, wrap the list in json.dumps:

    import json
    
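    # Run this on the original df, not on an already grouped result.
    # json.dumps writes the integer keys as strings, as in the expected output.
    f = lambda x: json.dumps([{y: {"status": "verified"}} for y in x])
    out = df.groupby('id')['mobile'].apply(f).reset_index()
    print(out)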
       id                                             mobile
    0   1  [{"9998887776": {"status": "verified"}}, {"777...
    1   2  [{"8887776665": {"status": "verified"}}, {"666...
    2   3           [{"5554443332": {"status": "verified"}}]
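
For a version that can be pasted and run as is, here is a minimal sketch that rebuilds the sample frame from the question and applies the same GroupBy.apply idea. The names to_verified_json, result and first are made up for this example, and the keys are cast with str() on the assumption that the mobile column holds integers and string keys are wanted, as in the expected output.

```python
import json

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 1, 2, 3],
    "mobile": [9998887776, 8887776665, 7776665554, 6665554443, 5554443332],
})


def to_verified_json(mobiles):
    """Serialize one group's numbers as a JSON array of {number: {"status": "verified"}}."""
    # str(m) makes the keys strings explicitly (hypothetical helper, not a pandas API).
    return json.dumps([{str(m): {"status": "verified"}} for m in mobiles])


# Keep the grouped output under a new name so df stays available for re-runs.
result = df.groupby("id")["mobile"].apply(to_verified_json).reset_index()

# Each cell is a JSON string; json.loads turns it back into Python objects.
first = json.loads(result.loc[result["id"] == 1, "mobile"].iloc[0])
print(first)
```

A named function is used instead of a lambda only so that it can carry a docstring; the grouping logic is unchanged. Parsing a cell back with json.loads is a quick way to check that the stored string is valid JSON with the schema you expect.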