Search code examples
pythonjsonpandasdataframeextract

How to extract lowest level of a nested JSON into a DataFrame in Python?


I'm looking to extract only the lowest-level objects of a JSON in Python. For example, the first few records from API look something like this:

{
  "season": 2021,
  "charts": {
    "ARI": {
      "TE": [
        {
          "team": "ARI",
          "position": "TE",
          "depth": "1",
          "playerId": "1443",
          "name": "Dan Arnold"
        },
        {
          "team": "ARI",
          "position": "TE",
          "depth": "2",
          "playerId": "599",
          "name": "Maxx Williams"
        }
      ],
      "K": [
        {
          "team": "ARI",
          "position": "K",
          "depth": "1",
          "playerId": "1121",
          "name": "Zane Gonzalez"
        },
        {

I ultimately want to get all of these results into a DataFrame that has this structure:

| team | position | depth | playerId | name |
|:---- |:-------- |:----- |:-------- |:---- |

I've tried variations of the following code without success:

import requests as rq
import pandas as pd

# Retrieve Depth Charts
json_depthCharts = rq.get(f"https://api.fantasynerds.com/v1/nfl/depth?apikey={API_KEY}").json()

df_depthCharts = pd.json_normalize(json_depthCharts, 'charts', ['charts', 'team'])

print(df_depthCharts)

Any insights are appreciated!


Solution

  • try json_normalize()+melt()+explode()+Dataframe():

    df=pd.DataFrame(pd.json_normalize(json_depthCharts).melt('season')['value'].explode().tolist())
    

    OR

    other way via stack()+drop() combination in place of melt() and rest all methods remains same:

    df=pd.DataFrame(pd.json_normalize(json_depthCharts).drop(columns='season').stack().explode().tolist())
    

    output of df:

            team    position    depth   playerId    name
    0       ARI     TE          1       1443        Dan Arnold
    1       ARI     TE          2       599         Maxx Williams
    2       ARI     K           1       1121        Zane Gonzalez
    3       ARI     K           2       1454        Brett Maher
    4       ARI     LWR         1       338         DeAndre Hopkins
    ...     ...     ...        ...      ...         ...
    932     WAS     LDE         2       179         Ryan Kerrigan
    933     WAS     RCB         1       647         Ronald Darby
    934     WAS     RB          1       1957        Antonio Gibson
    935     WAS     RB          2       1542        Bryce Love
    936     WAS     NB          1       1733        Jimmy Moreland
    
    937 rows × 5 columns