with some data,
df = pd.DataFrame({
"data": [
{"a":1},
{"b":2.1}
]
})
data
0 {'a': 1}
1 {'b': 2.1}
expanding the dictionary in the column using .json_normalize
:
df2 = pd.json_normalize(df['data'])
a b
0 1.0 NaN
1 NaN 2.1
How could those values be represented the same as the original data like below:
a b
0 1 NaN
1 NaN 2.1
One dummy solution appears to be to add a string like
df = pd.DataFrame({
"data": [
{"a":1},
{"b":2.1},
{"a": "dummy"}
]
})
then
pd.json_normalize(df['data'])
a b
0 1 NaN
1 NaN 2.1
2 dummy NaN
what would be some smarter solutions?
You could try this:
pd.json_normalize(
df["data"].apply(lambda x: {key: str(value) for key, value in x.items()})
)
# Output in Jupyter cell
a b
0 1 NaN
1 NaN 2.1