I'm trying to expand nested json array in pandas dataframe.
That's the JSON I have:
[ {
"id": "0001",
"name": "Stiven",
"location": [{
"country": "Colombia",
"department": "Chocó",
"city": "Quibdó"
}, {
"country": "Colombia",
"department": "Antioquia",
"city": "Medellin"
}, {
"country": "Colombia",
"department": "Cundinamarca",
"city": "Bogotá"
}
]
}, {
"id": "0002",
"name": "Jhon Jaime",
"location": [{
"country": "Colombia",
"department": "Valle del Cauca",
"city": "Cali"
}, {
"country": "Colombia",
"department": "Putumayo",
"city": "Mocoa"
}, {
"country": "Colombia",
"department": "Arauca",
"city": "Arauca"
}
]
}, {
"id": "0003",
"name": "Francisco",
"location": [{
"country": "Colombia",
"department": "Atlántico",
"city": "Barranquilla"
}, {
"country": "Colombia",
"department": "Bolívar",
"city": "Cartagena"
}, {
"country": "Colombia",
"department": "La Guajira",
"city": "Riohacha"
}
]
}
]
That's the dataframe I have:
index id name location
0 0001 Stiven [{'country':'Colombia', 'department': 'Chocó', 'city': 'Quibdó'}, {'country':'Colombia', 'department': 'Antioquia', 'city': 'Medellin'}, {'country':'Colombia', 'department': 'Cundinamarca', 'city': 'Bogotá'}]
1 0002 Jhon Jaime [{'country':'Colombia', 'department': 'Valle del Cauca', 'city': 'Cali'}, {'country':'Colombia', 'department': 'Putumayo', 'city': 'Mocoa'}, {'country':'Colombia', 'department': 'Arauca', 'city': 'Arauca'}]
2 0003 Francisco [{'country':'Colombia', 'department': 'Atlántico', 'city': 'Barranquilla'}, {'country':'Colombia', 'department': 'Bolívar', 'city': 'Cartagena'}, {'country':'Colombia', 'department': 'La Guajira', 'city': 'Riohacha'}]
I need to convert to dataframe per id something like this:
index id name country department city
0 0001 Stiven Colombia Chocó Quibdó
1 0001 Stiven Colombia Antioquia Medellin
2 0001 Stiven Colombia Cundinamarca Bogotá
3 0002 Jhon Jaime Colombia Valle del Cauca Cali
4 0002 Jhon Jaime Colombia Putumayo Mocoa
5 0002 Jhon Jaime Colombia Arauca Arauca
6 0003 Francisco Colombia Atlántico Barranquilla
7 0003 Francisco Colombia Bolívar Cartagena
8 0003 Francisco Colombia La Guajira Riohacha
JSON
is being loaded from a file, use json.loads
, but if the JSON
is directly from an API, it may not be necessary.pandas.json_normalize
with the meta
parameter, to convert the JSON
into a DataFrame.import pandas as pd
from pathlib import Path
import json
# path to file
p = Path(r'c:\path_to_file\test.json')
# read json
with p.open('r', encoding='utf-8') as f:
data = json.loads(f.read())
# create dataframe
df = pd.json_normalize(data, record_path='location', meta=['id', 'name'])
# output
country department city id name
Colombia Chocó Quibdó 0001 Stiven
Colombia Antioquia Medellin 0001 Stiven
Colombia Cundinamarca Bogotá 0001 Stiven
Colombia Valle del Cauca Cali 0002 Jhon Jaime
Colombia Putumayo Mocoa 0002 Jhon Jaime
Colombia Arauca Arauca 0002 Jhon Jaime
Colombia Atlántico Barranquilla 0003 Francisco
Colombia Bolívar Cartagena 0003 Francisco
Colombia La Guajira Riohacha 0003 Francisco