Search code examples
pythonjsonpandasdataframe

JSON to Pandas Dataframe with null values and missing columns


I am working with a JSON file that is designed as such:

f = {'results':
              [{'tables':
                        [{'rows': [{'column1': 'dog', 'column2': 'blue', 'column3': 'sad'},
                                  { 'column2': 'red', 'column3': 'happy'},
                                  {'column1': 'bird', 'column2': 'green'}]
      }]}]} 

DESIRED pandas dataframe that accounts for rows with missing columns:

      column1     column2     column3
        dog         blue        sad
                    red        happy
        bird        green        

Any suggestions is greatly appreciated.


Solution

  • Once you've read your data, you can normalise it as a normal dictionary, see below should do it:

    import pandas as pd
    
    # read the file
    data = pd.read_json('my_dummy_file.json')
    
    # normalise your data
    rows = data['results'][0]['tables'][0]['rows']
    df = pd.json_normalize(rows)
    
    # fill missing values
    df.fillna('', inplace=True)
    
    print(df)
    
      column1 column2 column3
    0     dog    blue     sad
    1             red   happy
    2    bird   green