python json pandas dataframe json-normalize

Flattening multi nested json into a pandas dataframe

I'm trying to flatten this json response into a pandas dataframe to export to csv.

It looks like this:

j = [
    {
        "id": 401281949,
        "teams": [
            {
                "school": "Louisiana Tech",
                "conference": "Conference USA",
                "homeAway": "away",
                "points": 34,
                "stats": [
                    {"category": "rushingTDs", "stat": "1"},
                    {"category": "puntReturnYards", "stat": "24"},
                    {"category": "puntReturnTDs", "stat": "0"},
                    {"category": "puntReturns", "stat": "3"},
                ],
            }
        ],
    }
]

...Many more items in the stats area. If I run this and flatten to the teams level:

multiple_level_data = pd.json_normalize(j, record_path =['teams'])

I get:

           school      conference homeAway  points                                              stats
0  Louisiana Tech  Conference USA     away      34  [{'category': 'rushingTDs', 'stat': '1'}, {'ca...

How do I flatten it twice so that all of the stats are on their own column in each row?

If I do this:

multiple_level_data = pd.json_normalize(j, record_path =['teams'])
multiple_level_data = multiple_level_data.explode('stats').reset_index(drop=True)
multiple_level_data=multiple_level_data.join(pd.json_normalize(multiple_level_data.pop('stats')))

I end up with multiple rows instead of more columns:

Solution

You can try:

df = pd.DataFrame(j).explode("teams")
df = pd.concat([df, df.pop("teams").apply(pd.Series)], axis=1)

df["stats"] = df["stats"].apply(lambda x: {d["category"]: d["stat"] for d in x})

df = pd.concat(
    [
        df,
        df.pop("stats").apply(pd.Series),
    ],
    axis=1,
)

print(df)

Prints:

          id          school      conference homeAway  points rushingTDs puntReturnYards puntReturnTDs puntReturns
0  401281949  Louisiana Tech  Conference USA     away      34          1              24             0           3