Any ideas on how to transform this JSON file into a usable dataframe format:
pd.read_json("http://api.census.gov/data/2014/acsse/variables.json")
Here's how the table should look: http://api.census.gov/data/2014/acsse/variables.html
Say you start with
df = pd.read_json("http://api.census.gov/data/2014/acsse/variables.json")
The problem is that the column is of dicts:
In [28]: df.variables.head()
Out[28]:
AIANHH {u'concept': u'Selectable Geographies', u'pred...
ANRC {u'concept': u'Selectable Geographies', u'pred...
BST {u'concept': u'Selectable Geographies', u'pred...
CBSA {u'concept': u'Selectable Geographies', u'pred...
CD {u'concept': u'Selectable Geographies', u'pred...
Name: variables, dtype: object
But you can solve this by applying a Series
:
In [27]: df.variables.apply(pd.Series)
Out[27]:
concept \
AIANHH Selectable Geographies
ANRC Selectable Geographies
BST Selectable Geographies
CBSA Selectable Geographies
CD Selectable Geographies
CNECTA Selectable Geographies
...
This is the DataFrame you want, probably, as can be shown by:
In [32]: df.variables.apply(pd.Series).columns
Out[32]: Index([u'concept', u'label', u'predicateOnly', u'predicateType'], dtype='object')