I need help converting JSON data into a DataFrame. Could you please show me how to do this?
Example:
JSON DATA
{
    "user_id": "vmani4",
    "password": "*****",
    "api_name": "KOL",
    "body": {
        "api_name": "KOL",
        "columns": [
            "kol_id",
            "jnj_id",
            "kol_full_nm",
            "thrc_cd"
        ],
        "filter": {
            "kol_id": "101152",
            "jnj_id": "7124166",
            "thrc_nm": "VIR"
        }
    }
}
Desired output:
user_id  password  api_name  columns      filter   filter_value
vmani4   *****     KOL       kol_id       kol_id   101152
                             jnj_id       jnj_id   7124166
                             kol_full_nm  thrc_nm  VIR
                             thrc_cd
I'm not very familiar with DataFrames, but I tried my best to come up with a solution that produces your desired output.
import json

import pandas as pd
json_data = """
{
    "user_id": "vmani4",
    "password": "*****",
    "api_name": "KOL",
    "body": {
        "api_name": "KOL",
        "columns": [
            "kol_id",
            "jnj_id",
            "kol_full_nm",
            "thrc_cd"
        ],
        "filter": {
            "kol_id": "101152",
            "jnj_id": "7124166",
            "thrc_nm": "VIR"
        }
    }
}
"""
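python_data = json.loads(json_data)

first_level = {}            # top-level scalar fields: user_id, password, api_name
for_columns = {}            # the "columns" list from "body"
list_for_filter = []        # keys of body["filter"]
list_for_filter_value = []  # values of body["filter"]

for x, y in python_data.items():
    if isinstance(y, dict):
        # y is the nested "body" dict
        for j, k in y.items():
            if j == 'columns':
                for_columns[j] = k
            if isinstance(k, dict):
                # k is the "filter" dict: split it into keys and values
                for m, n in k.items():
                    list_for_filter.append(m)
                    list_for_filter_value.append(n)
        break  # stop once the nested dict has been processed
    # Wrap each scalar in a list so it becomes a one-row column
    first_level[x] = [y]

# Named filter_col to avoid shadowing the built-in filter()
filter_col = {'filter': list_for_filter}
filter_value = {'filter_value': list_for_filter_value}

res = {**first_level, **for_columns, **filter_col, **filter_value}
# The lists have different lengths, so build one Series per column and
# concatenate them side by side; shorter columns are padded with NaN.
df = pd.concat([pd.Series(v, name=k) for k, v in res.items()], axis=1)
print(df)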
   user_id  password  api_name      columns   filter  filter_value
0   vmani4     *****       KOL       kol_id   kol_id        101152
1      NaN       NaN       NaN       jnj_id   jnj_id       7124166
2      NaN       NaN       NaN  kol_full_nm  thrc_nm           VIR
3      NaN       NaN       NaN      thrc_cd      NaN           NaN
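If you would rather see blank cells than NaN, as in your desired output, one option is to fill the missing values with empty strings before printing. This is only a sketch: the small `df` below is a stand-in I made up with two of the columns, and `blank_df` is a name I chose for illustration.

```python
import pandas as pd

# Stand-in for the DataFrame printed above (same idea, fewer columns).
df = pd.concat(
    [
        pd.Series(["vmani4"], name="user_id"),
        pd.Series(["kol_id", "jnj_id", "kol_full_nm", "thrc_cd"], name="columns"),
    ],
    axis=1,
)

# Replace the NaN padding with empty strings so the cells print as blanks.
blank_df = df.fillna("")

# index=False hides the 0..3 row labels, which the desired output does not show.
print(blank_df.to_string(index=False))
```

Note that this turns the missing values into real empty strings, so it is best done only for display, after any processing that relies on NaN.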
Let me give you a short rundown of my code. First, I created several lists and dicts, because your desired output has some columns that aren't actually in your JSON, such as filter_value.
I also loop through the dict items to build another dict that matches the desired output.
Finally, because the lists going into the DataFrame are not of equal length, I used concat and Series.
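To show why the unequal lengths matter, here is a minimal sketch using a cut-down dict with hypothetical contents (three of the six columns). As far as I know, passing lists of different lengths straight to the DataFrame constructor raises a ValueError, while one Series per column is aligned on the index and padded with NaN.

```python
import pandas as pd

# Hypothetical cut-down data: three lists of different lengths.
res = {
    "user_id": ["vmani4"],
    "columns": ["kol_id", "jnj_id", "kol_full_nm", "thrc_cd"],
    "filter": ["kol_id", "jnj_id", "thrc_nm"],
}

# The plain constructor refuses lists of different lengths.
try:
    pd.DataFrame(res)
    constructor_failed = False
except ValueError:
    constructor_failed = True

# One Series per column, concatenated along axis=1, aligns on the index
# and pads the shorter columns with NaN.
df = pd.concat([pd.Series(v, name=k) for k, v in res.items()], axis=1)
print(df)
```

The result has as many rows as the longest list, and every shorter column is NaN from its last value onwards.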