how to get a cross table from nested dictionaries in a json file

I would like to get a cross-table by using pandas dataframe with Python like this:

	Home	Away	Tot
played	19	19	38
wins	10	8	18
draws	7	5	12
loss	2	6	8
goals_for	40	26	66
goals_against	17	19	36
avg_goal_for	2.1	1.4	1.7
avg_goal_against	0.9	1.0	0.9

and the json file is :

{
"get": "teams/statistics",
"response": {
    "fixtures": {
        "played": {
            "home": 19,
            "away": 19,
            "total": 38
        },
        "wins": {
            "home": 10,
            "away": 8,
            "total": 18
        },
        "draws": {
            "home": 7,
            "away": 5,
            "total": 12
        },
        "loses": {
            "home": 2,
            "away": 6,
            "total": 8
        }
    },
    "goals": {
        "for": {
            "total": {
                "home": 40,
                "away": 26,
                "total": 66
            },
            "average": {
                "home": "2.1",
                "away": "1.4",
                "total": "1.7"
            }
        },
        "against": {
            "total": {
                "home": 17,
                "away": 19,
                "total": 36
            },
            "average": {
                "home": "0.9",
                "away": "1.0",
                "total": "0.9"
            }
        }
    }

}

}

I don't know how to handle all nested dictionaries in an easy way. Please someone can help me out. Thanks

Solution

One way to do this is to process both halves of your response separately, using json_normalize to read the data, then splitting the column names to a multi-level index and then using stack to move to a long format from the wide format. You can then drop the top-level index and concat the two dataframes:

data = json.loads(jstr)

fixtures = pd.json_normalize(data['response']['fixtures'])
fixtures.columns = fixtures.columns.str.split('.', expand=True)
fixtures = fixtures.stack(level=0)
fixtures.index = fixtures.index.droplevel()
fixtures
#         away  home  total
# draws      5     7     12
# loses      6     2      8
# played    19    19     38
# wins       8    10     18

goals = pd.json_normalize(data['response']['goals'])
goals.columns = goals.columns.str.split(r'\.(?!.*\.)', expand=True)
goals = goals.stack(level=0)
goals.index = goals.index.droplevel()
goals
#                 away home total
# against.average  1.0  0.9   0.9
# against.total     19   17    36
# for.average      1.4  2.1   1.7
# for.total         26   40    66

out = pd.concat([fixtures, goals])

Output:

                away home total
draws              5    7    12
loses              6    2     8
played            19   19    38
wins               8   10    18
against.average  1.0  0.9   0.9
against.total     19   17    36
for.average      1.4  2.1   1.7
for.total         26   40    66

Note, if desired you can rename the index column to match your question using

out.index = ['draws', 'loses', 'played', 'wins', 'goals_against_average', 'goals_against_total', 'goals_for_average', 'goals_for_total']

Output:

                      away home total
draws                    5    7    12
loses                    6    2     8
played                  19   19    38
wins                     8   10    18
goals_against_average  1.0  0.9   0.9
goals_against_total     19   17    36
goals_for_average      1.4  2.1   1.7
goals_for_total         26   40    66

Note this code can be made more generic, by creating a dict of the sub-dictionary names and the appropriate splitting string, and then iterating that dict:

parts = { 'fixtures' : '.', 'goals' : r'\.(?!.*\.)' }
dfs = []
for part in parts:
    df = pd.json_normalize(data['response'][part])
    df.columns = df.columns.str.split(parts[part], expand=True)
    df = df.stack(level=0)
    df.index = df.index.droplevel()
    dfs.append(df)

out = pd.concat(dfs)