Given the following Pandas DataFrame (the original DataFrame has 200+ rows):
import pandas as pd
df = pd.DataFrame({
'child': ['Europe', 'France', 'Paris','North America', 'US', 'Canada'],
'parent': ["", 'Europe', 'France',"", 'North America', 'North America'],
'value': [746.4, 67.75, 2.16, 579, 331.9, 38.25]
})
df
|---+---------------+---------------+--------|
| | child | parent | value |
|---+---------------+---------------+--------|
| 0 | Europe | | 746.40 |
| 1 | France | Europe | 67.75 |
| 2 | Paris | France | 2.16 |
| 3 | North America | | 579.00 |
| 4 | US | North America | 331.90 |
| 5 | Canada | North America | 38.25 |
|---+---------------+---------------+--------|
I want to generate the following JSON tree:
[
{
name: 'Europe',
value: 746.4,
children: [
{
name: 'France',
value: 67.75,
children: [
{
name: 'Paris',
value: 2.16
}
]
}
]
},
{
name: 'North America',
value: 579,
children: [
{
name: 'US',
value: 331.9
},
{
name: 'Canada',
value: 38.25
}
]
}
];
This tree will be used as input for ECharts visualizations, for example this basic sunburst chart.
There is a library called bigtree that can do exactly what you are looking for.
import json
import bigtree
# Set the parent values for children without parents to ROOT
df["parent"] = df["parent"].replace(r'^$', "ROOT", regex = True)
tree = bigtree.dataframe_to_tree_by_relation(df, "child", "parent")
# tree.show(all_attrs = True)
# Convert to a nested dict and keep only the children of the artificial ROOT node
tree_dict = bigtree.tree_to_nested_dict(tree, all_attrs = True)["children"]
# Serialize the resulting list of top-level nodes to a JSON string
print(json.dumps(tree_dict, indent = 2))
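If adding a dependency is not an option, the same nesting can probably be built with plain dictionaries. The following is only a minimal sketch, not part of bigtree: `build_tree` is a hypothetical helper name, and it assumes that child names are unique, that root rows have an empty string as parent, and that every non-empty parent also appears as a child. The rows are written out as tuples so the snippet runs on its own.

```python
import json

# (child, parent, value) rows mirroring the example DataFrame.
rows = [
    ("Europe", "", 746.4),
    ("France", "Europe", 67.75),
    ("Paris", "France", 2.16),
    ("North America", "", 579),
    ("US", "North America", 331.9),
    ("Canada", "North America", 38.25),
]


def build_tree(records):
    """Nest (child, parent, value) records into a list of root nodes.

    Hypothetical helper; assumes unique child names and "" as the root marker.
    """
    records = list(records)
    # First pass: create one node per row, keyed by its name.
    nodes = {child: {"name": child, "value": value} for child, _, value in records}
    roots = []
    # Second pass: attach each node to its parent, or collect it as a root.
    for child, parent, _ in records:
        node = nodes[child]
        if parent == "":
            roots.append(node)
        else:
            # "children" is only created for nodes that actually have children.
            nodes[parent].setdefault("children", []).append(node)
    return roots


tree = build_tree(rows)
print(json.dumps(tree, indent=2))
```

With the DataFrame from the question, the records could be taken from `df[["child", "parent", "value"]].itertuples(index=False, name=None)` instead of the hard-coded list. Because nodes are created in a first pass and linked in a second, the row order should not matter.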
Also see: Read data from a pandas DataFrame and create a tree using anytree in python