Search code examples
pythondata-structurestreehierarchical-data

How can I convert a list of dicts showing hierarchical relationships into a tree?


I have this list of dicts:

[
    {"node": "root", "children": ["a"]},
    {"node": "a", "children": ["b", "b"]},
    {"node": "b", "children": ["c", "c", "c"]},
    {"node": "c", "children": ["d"]},
]

that represents a compressed tree. What I mean by that is this list of dict represents the following tree:

example uncompressed tree

What kind of a data structure can I convert this list of dict into so I can expand it into a tree? I was thinking of unrolling the list of dicts to something like:

{"root": [
    {"a": [
        {"b": [
            {"c": [
                {"d": "None"}
              ]
            },
            {"c": [
                {"d": "None"}
              ]
            },
            {"c": [
                {"d": "None"}
              ]
            }
          ]
        },
        {"b": [
            {"c": [
                {"d": "None"}
              ]
            },
            {"c": [
                {"d": "None"}
              ]
            },
            {"c": [
                {"d": "None"}
              ]
            }
          ]
        }
      ]
    }
  ]
}

Looks messy, but essentially a nested dict of nodes to list of children. Not really sure how to achieve this. Any other ideas to uncompress this tree are welcome!

Ideally, I would be able to throw this into some tree library like treelib to get methods of listing the leaf nodes, accessing data of parents, grandparents, etc.


Solution

  • First, I would convert this:

    l = [
        {"node": "root", "children": ["a"]},
        {"node": "a", "children": ["b", "b"]},
        {"node": "b", "children": ["c", "c", "c"]},
        {"node": "c", "children": ["d"]},
    ]
    

    into a bit more workable format:

    compressed = {e["node"]: e["children"] for e in l}
    

    Then it's quite simple:

    def expand(compressed, root):
        children = [expand(compressed, n) for n in compressed.get(root, [])]
        return {root: children or "None"}
    

    In your example:

    >>> from pprint import pprint
    >>> pprint(expand(compressed, "root"))
    {'root': [{'a': [{'b': [{'c': [{'d': 'None'}]},
                            {'c': [{'d': 'None'}]},
                            {'c': [{'d': 'None'}]}]},
                     {'b': [{'c': [{'d': 'None'}]},
                            {'c': [{'d': 'None'}]},
                            {'c': [{'d': 'None'}]}]}]}]}
    

    That said, I would recommend not replacing an empty child list with the string "None" and simply have it be an empty list instead (so simply remove the or "None" above).