Search code examples
pythonjsonpandascensus

How to convert JSON into dataframe


Any ideas on how to transform this JSON file into a usable dataframe format:

pd.read_json("http://api.census.gov/data/2014/acsse/variables.json")

Here's how the table should look: http://api.census.gov/data/2014/acsse/variables.html


Solution

  • Say you start with

    df = pd.read_json("http://api.census.gov/data/2014/acsse/variables.json")
    

    The problem is that the column is of dicts:

    In [28]: df.variables.head()
    Out[28]: 
    AIANHH    {u'concept': u'Selectable Geographies', u'pred...
    ANRC      {u'concept': u'Selectable Geographies', u'pred...
    BST       {u'concept': u'Selectable Geographies', u'pred...
    CBSA      {u'concept': u'Selectable Geographies', u'pred...
    CD        {u'concept': u'Selectable Geographies', u'pred...
    Name: variables, dtype: object
    

    But you can solve this by applying a Series:

    In [27]: df.variables.apply(pd.Series)
    Out[27]: 
                                                             concept  \
    AIANHH                                    Selectable Geographies   
    ANRC                                      Selectable Geographies   
    BST                                       Selectable Geographies   
    CBSA                                      Selectable Geographies   
    CD                                        Selectable Geographies   
    CNECTA                                    Selectable Geographies   
    ...
    

    This is the DataFrame you want, probably, as can be shown by:

    In [32]: df.variables.apply(pd.Series).columns
    Out[32]: Index([u'concept', u'label', u'predicateOnly', u'predicateType'], dtype='object')