Search code examples
pythonjsonpandasgroup-bypymongo

JSON agrouped by one field


I'm trying to transform an dataframe to json but one column must agrouped some others, like that:

| name   | order | age | color    | rank |
| robert | car   | 25  | red      | 5    |
| helena | car   | 45  | yellow   | 4    |
| cleito | car   | 35  | green    | 4.5  |

but the json would be like that to insert into MongoDB:

{
  [
    'name': 'robert',
    'age' : 25,
    'order': 'car', 
    'details': [{
          'color': 'red',
          'rank': 5
           }],
  ...
  ]
}

I had also tried itertools and pandas functions but I didn't found a great way to do that.


Solution

  • You can use to_dict to combine multiple columns.

    df['details'] = df[['color', 'rank']].to_dict(orient='records')
    df = df.drop(['color', 'rank'], axis=1)
    
    # df.to_json(orient='records') or df.to_dict(orient='records')
    

    (I didn't put details in an array though. I was not sure if you want an array of 1 object.)

    Result

    [
      {
        "name": "robert",
        "order": "car",
        "age": 25,
        "details": {
          "color": "red",
          "rank": 5.0
        }
      },
      {
        "name": "helena",
        "order": "car",
        "age": 45,
        "details": {
          "color": "yellow",
          "rank": 4.0
        }
      },
      {
        "name": "cleito",
        "order": "car",
        "age": 35,
        "details": {
          "color": "green",
          "rank": 4.5
        }
      }
    ]