Search code examples
pythonjsonpandasdataframejson-normalize

Flattening multi nested json into a pandas dataframe


I'm trying to flatten this json response into a pandas dataframe to export to csv.

It looks like this:

j = [
    {
        "id": 401281949,
        "teams": [
            {
                "school": "Louisiana Tech",
                "conference": "Conference USA",
                "homeAway": "away",
                "points": 34,
                "stats": [
                    {"category": "rushingTDs", "stat": "1"},
                    {"category": "puntReturnYards", "stat": "24"},
                    {"category": "puntReturnTDs", "stat": "0"},
                    {"category": "puntReturns", "stat": "3"},
                ],
            }
        ],
    }
]

...Many more items in the stats area. If I run this and flatten to the teams level:

multiple_level_data = pd.json_normalize(j, record_path =['teams'])

I get:

           school      conference homeAway  points                                              stats
0  Louisiana Tech  Conference USA     away      34  [{'category': 'rushingTDs', 'stat': '1'}, {'ca...

How do I flatten it twice so that all of the stats are on their own column in each row?

If I do this:

multiple_level_data = pd.json_normalize(j, record_path =['teams'])
multiple_level_data = multiple_level_data.explode('stats').reset_index(drop=True)
multiple_level_data=multiple_level_data.join(pd.json_normalize(multiple_level_data.pop('stats')))

I end up with multiple rows instead of more columns:

enter image description here


Solution

  • You can try:

    df = pd.DataFrame(j).explode("teams")
    df = pd.concat([df, df.pop("teams").apply(pd.Series)], axis=1)
    
    df["stats"] = df["stats"].apply(lambda x: {d["category"]: d["stat"] for d in x})
    
    df = pd.concat(
        [
            df,
            df.pop("stats").apply(pd.Series),
        ],
        axis=1,
    )
    
    print(df)
    

    Prints:

              id          school      conference homeAway  points rushingTDs puntReturnYards puntReturnTDs puntReturns
    0  401281949  Louisiana Tech  Conference USA     away      34          1              24             0           3