I already browsed Stack Overflow and many sites, but I could not find a solution resolving my problem. I am trying to convert a JSON file received from a multi-dimensionnal inference engine (AI), to an Excel format through a pandas dataframe. This JSON file has a quite complex structure with many levels down (I am not familiar with JSON language).
Here is the JSON format:
{
"data": {
"queries": [{
"id": 292,
"**name**": "Data_8",
"queryId": 0,
"values": {
"Entreprise": {
"F1": {
"rule": 1450,
"**value**": "1.000000000000"
}
}
}
}, {
"id": 430,
"name": "Data_9",
"queryId": 1,
"values": {
"Entreprise": {
"F1": {
"rule": 1437,
"value": "N"
}
}
}
}, {
"id": 359,
"**name**": "Data_10",
"queryId": 2,
"values": {
"Entreprise": {
"F1": {
"rule": 876,
"**value**": "O"
}
}
}
}, and so on.
I load this file with:
with open(output_JSON) as data_file:
data2= json.load(data_file)
From the file, I need to retrieve only two fields: name and the correspondant value, in two columns.
Desired output of the dataframe is :
name value
0 Data_8 1
1 Data_9 N
2 Data_10 O
Thank you for your time and your help.
Anyway, take care and stay safe @ home.
Greetings from Paris, France :)
UPDATE (2/05/2020): File is loaded with :
with open(output_JSON) as data_file:
data2= json.load(data_file)
then:
df = pd.DataFrame(data2['data'])
print (df)
returns:
queries
0 {'id': 292, 'name': 'Data_8', 'queryId': 0, 'v...
1 {'id': 430, 'name': 'Data_9', 'queryId': 1, 'v...
2 {'id': 359, 'name': 'Data_10', 'queryId': 2, '...
then:
df2=df["queries"]
print (df2)
returns quite the same:
0 {'id': 292, 'name': 'Data_8', 'queryId': 0, 'v...
1 {'id': 430, 'name': 'Data_9', 'queryId': 1, 'v...
2 {'id': 359, 'name': 'Data_10', 'queryId': 2, '...
At this point I dont know how to get name and value columns in a dataframe.
Thank you again for your time :)
json_normalize
from pandas.io.json import json_normalize
data2
test.json
{
"data": {
"queries": [{
"id": 292,
"name": "Data_8",
"queryId": 0,
"values": {
"Entreprise": {
"F1": {
"rule": 1450,
"value": "1.000000000000"
}
}
}
}, {
"id": 430,
"name": "Data_9",
"queryId": 1,
"values": {
"Entreprise": {
"F1": {
"rule": 1437,
"value": "N"
}
}
}
}, {
"id": 359,
"name": "Data_10",
"queryId": 2,
"values": {
"Entreprise": {
"F1": {
"rule": 876,
"value": "O"
}
}
}
}
]
}
}
import pandas as pd
from pandas.io.json import json_normalize # not needed for current pandas
from pathlib import Path
import json
# load test.json
p = Path(r'c:\Users\...\test.json')
with p.open('r', encoding='utf-8') as f:
data2 = json.loads(f.read())
# create the dataframe
# df = pd.json_normalize(data2, ['data', 'queries']) # use for modern pandas
df = json_normalize(data2, ['data', 'queries'])
df.rename(columns={'values.Entreprise.F1.value': 'value'}, inplace=True)
df2 = df[['name', 'value']].copy()
print(df)
id name queryId values.Entreprise.F1.rule value queries.name
0 292 Data_8 0 1450 1.000000000000 NaN
1 430 Data_9 1 1437 N NaN
2 359 Data_10 2 876 O NaN
print(df2)
name value
0 Data_8 1.000000000000
1 Data_9 N
2 Data_10 O