To use streamlit_tree_select, I need to convert a dataframe to the nested structure it expects.
I think I could use DataFrame.groupby('parkey') to group the children, but I'm not sure how to attach each group to the appropriate parent while iterating over the groups.
The data behind the dataframe holding the categories:
import pandas as pd
data = [
    {"idnr": 1, "parkey": 0, "descr": "A", "info": "string"},
    {"idnr": 2, "parkey": 0, "descr": "B", "info": "string"},
    {"idnr": 3, "parkey": 2, "descr": "B B 1", "info": "string"},
    {"idnr": 4, "parkey": 3, "descr": "B B B 1", "info": "string"},
    {"idnr": 5, "parkey": 3, "descr": "B B B 2", "info": "string"}
]
The expected output:
output = [
    {"idnr": 1, "parkey": 0, "descr": "A", "info": "string"},
    {"idnr": 2, "parkey": 0, "descr": "B", "info": "string", "children": [
        {"idnr": 3, "parkey": 2, "descr": "B B 1", "info": "string", "children": [
            {"idnr": 4, "parkey": 3, "descr": "B B B 1", "info": "string"},
            {"idnr": 5, "parkey": 3, "descr": "B B B 2", "info": "string"}
        ]}
    ]}
]
One way to do this is to pre-process the data, forming a dict that maps each parent's key to a list of its children. You can then process the entry for key 0 of this dict (the top-level rows), recursively adding children from the dict to the appropriate children array:
def add_child(tree, child):
    # file the row under its parent's key
    key = child['parkey']
    tree[key] = tree.get(key, []) + [child]

parents = dict()
for child in data:
    add_child(parents, child)
Output:
{
    0: [
        {'idnr': 1, 'parkey': 0, 'descr': 'A', 'info': 'string'},
        {'idnr': 2, 'parkey': 0, 'descr': 'B', 'info': 'string'}
    ],
    2: [
        {'idnr': 3, 'parkey': 2, 'descr': 'B B 1', 'info': 'string'}
    ],
    3: [
        {'idnr': 4, 'parkey': 3, 'descr': 'B B B 1', 'info': 'string'},
        {'idnr': 5, 'parkey': 3, 'descr': 'B B B 2', 'info': 'string'}
    ]
}
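The same grouping step can also be sketched with collections.defaultdict, which appends to one list per key instead of building a new list on every call. This is only an illustrative variant: the helper name group_by_parent and the shortened rows list are assumptions, not part of the original code.

```python
from collections import defaultdict

def group_by_parent(rows):
    """Map each parkey to the list of rows that name it as their parent."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["parkey"]].append(row)
    # return a plain dict so later 'in' checks never create empty entries
    return dict(groups)

# shortened, hypothetical sample in the same shape as data
rows = [
    {"idnr": 1, "parkey": 0, "descr": "A"},
    {"idnr": 2, "parkey": 0, "descr": "B"},
    {"idnr": 3, "parkey": 2, "descr": "B B 1"},
]
grouped = group_by_parent(rows)
print(sorted(grouped))  # [0, 2]
```

The result has the same shape as the dict shown above, so it should be usable wherever that dict is.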
Now you can iterate over the entries in parents[0], recursively adding children as you go:
def add_children(tree, parents):
    for child in tree:
        # does this node have any children?
        idnr = child['idnr']
        if idnr in parents:
            # attach the children, then recurse into them
            child['children'] = parents[idnr]
            add_children(child['children'], parents)

output = parents[0]
add_children(output, parents)
Output:
[
    {'idnr': 1, 'parkey': 0, 'descr': 'A', 'info': 'string'},
    {'idnr': 2, 'parkey': 0, 'descr': 'B', 'info': 'string', 'children': [
        {'idnr': 3, 'parkey': 2, 'descr': 'B B 1', 'info': 'string', 'children': [
            {'idnr': 4, 'parkey': 3, 'descr': 'B B B 1', 'info': 'string'},
            {'idnr': 5, 'parkey': 3, 'descr': 'B B B 2', 'info': 'string'}
        ]}
    ]}
]
Notes:
The add_children routine modifies the dicts in the data list, as it relies on references to work. If you don't want that, make a copy of data first or change the add_child code to make copies when assigning child values.
The add_child and add_children steps could probably be combined; however, splitting the task means that data does not have to be sorted by parkey.
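If mutating the input is a concern, a non-mutating variant might look like the sketch below. It copies each row before attaching children, so the original list is left untouched. The name build_tree, its root parameter, and the shortened sample list are hypothetical; the sketch also assumes the parent links contain no cycles, since a cycle would make the recursion run forever.

```python
def build_tree(rows, root=0):
    """Return a nested copy of rows; the input dicts are not modified."""
    # step 1: group rows by parent key (order of rows does not matter)
    children_of = {}
    for row in rows:
        children_of.setdefault(row["parkey"], []).append(row)

    # step 2: recursively copy each row and attach copies of its children
    def attach(parent_id):
        nodes = []
        for row in children_of.get(parent_id, []):
            node = dict(row)  # shallow copy of the row
            kids = attach(row["idnr"])
            if kids:
                node["children"] = kids
            nodes.append(node)
        return nodes

    return attach(root)

# shortened, hypothetical sample; deliberately not sorted by parkey
sample = [
    {"idnr": 5, "parkey": 3, "descr": "B B B 2"},
    {"idnr": 3, "parkey": 2, "descr": "B B 1"},
    {"idnr": 1, "parkey": 0, "descr": "A"},
    {"idnr": 2, "parkey": 0, "descr": "B"},
]
tree = build_tree(sample)
print(tree[1]["children"][0]["descr"])  # B B 1
```

Because dict(row) is only a shallow copy, any nested mutable values inside a row would still be shared with the input; copy.deepcopy could be swapped in if that matters.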