Tags: python, data-structures, tree, console

Vertical data tree in console with coordinates with Python


I have been trying a lot, but I am completely lost in all my different attempts at coding this.

What I need seems rather simple:

I have data structured like this in a list of dicts:

units = [
    {
        "name": "A",
        "parent": None
    },
    {
        "name": "B",
        "parent": "A"
    },
    {
        "name": "C",
        "parent": "A"
    },
    {
        "name": "D",
        "parent": "A"
    },
    {
        "name": "E",
        "parent": None
    }
]

I simply want to structure this data in a vertical data tree in the console and pull the coordinates of these structures for further use.

I have tried using multiple libraries, structuring the data myself in table grids, and so on. I would really appreciate it if someone could help me out here.

EDIT: Wanted outcome:

Graphic visualization: https://pasteboard.co/qq2A0U5Pd0Yr.png

Coordinates could be presented like this, as (x,y): https://pasteboard.co/kM9S4bxISP96.png

A: 2,1
B: 1,2
C: 1,2
D: 1,3
E: 14,1


Solution

  • In general, when you want to work with related nodes in Python, networkx is the way to go. This provides both visualization and graph analysis tools, and is well worth the time and effort to at least get the basics down.

    So for your visualization, it's just this:

    units = [
        {"name": "A", "parent": None},
        {"name": "B", "parent": "A"},
        {"name": "C", "parent": "A"},
        {"name": "D", "parent": "A"},
        {"name": "E", "parent": None},
    ]
    
    import matplotlib.pyplot as plt
    import networkx as nx
    import pydot  # not used directly, but must be installed: nx_pydot relies on it
    from networkx.drawing.nx_pydot import graphviz_layout
    
    # set up the edges of the graph
    rels = [[d["parent"], d["name"]] for d in units if d["parent"]]
    # set up the nodes
    nodes = [d["name"] for d in units]
    # create your directed graph
    g = nx.DiGraph()
    g.add_nodes_from(nodes)
    g.add_edges_from(rels)
    
    # use graphviz to generate the layout positions
    pos = graphviz_layout(g, prog="dot")
    # draw the graph
    nx.draw(g, pos, node_color="w", node_size=200, with_labels=True)
    plt.show()
    


    Your 'coordinates' requirement doesn't seem to be fully specified, and I'm not sure what your objective is, but if you really want something like what you've shown, you can get it with this (quite hacky) pandas-based solution:

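    import pandas as pd
    # one column per edge: parents in row 0, children in row 1
    someDF = pd.DataFrame(rels).T
    # append a fourth column holding the unconnected node "E"
    coordDF = pd.concat([someDF, pd.DataFrame(["E", None], columns=[3])], axis=1)
    # blank the repeated "A" entries so that "A" appears only once, above "C"
    coordDF.iloc[0,0] = None
    coordDF.iloc[0,2] = None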
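
    If a plain grid is all you need, the same kind of table can be sketched without pandas. This is only an illustration under my own assumptions: the helper `depth`, the names `parent_of`, `rows` and `coords`, and the coordinate convention (x is the 1-based position within a row, y is the 1-based row number) are not from the question, and parents are not centred over their children. It also assumes every parent appears in `units` and that there are no cycles.

```python
from collections import defaultdict

units = [
    {"name": "A", "parent": None},
    {"name": "B", "parent": "A"},
    {"name": "C", "parent": "A"},
    {"name": "D", "parent": "A"},
    {"name": "E", "parent": None},
]

# Look-up table from each node name to its parent's name (None for roots).
parent_of = {u["name"]: u["parent"] for u in units}


def depth(name):
    """Count the ancestors above `name` (0 for a root). Hypothetical helper."""
    d = 0
    while parent_of[name] is not None:
        name = parent_of[name]
        d += 1
    return d


# Group names by row (depth), keeping the input order within each row.
rows = defaultdict(list)
for u in units:
    rows[depth(u["name"])].append(u["name"])

# Assumed convention: x = position within the row, y = row number, both 1-based.
coords = {
    name: (x, y + 1)
    for y, names in rows.items()
    for x, name in enumerate(names, start=1)
}

# Print one console line per row, top row first, then the coordinates.
for y in sorted(rows):
    print("  ".join(rows[y]))
print(coords)
```

    The roots end up on the first line and their children on the second, and `coords` can then be used for whatever comes next.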
    


    Notes:

    • having put your nodes and relationships into a networkx graph, you've got a wonderful array of functions you can use to analyze or display the relationships (https://networkx.org/documentation/stable/reference/introduction.html)
    • it's been a while since I first started using this, but I think you have to install graphviz separately (https://graphviz.org/download/)
    • networkx has lots of other visualization styles that don't require graphviz, but they're generally intended for more complex structures
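
    As a small illustration of the analysis functions mentioned in the first note, here is a sketch that asks the graph a few questions about the hierarchy. The variable names are my own, the graph is rebuilt inline so the snippet runs by itself, and `nx.topological_generations` assumes a reasonably recent networkx release.

```python
import networkx as nx

# Rebuild the same small directed graph: A is the parent of B, C and D; E stands alone.
g = nx.DiGraph()
g.add_nodes_from(["A", "B", "C", "D", "E"])
g.add_edges_from([("A", "B"), ("A", "C"), ("A", "D")])

# Roots are the nodes with no incoming edge.
roots = [n for n, deg in g.in_degree() if deg == 0]

# Direct children of A, and everything reachable below A.
children_of_a = list(g.successors("A"))
below_a = nx.descendants(g, "A")

# Nodes grouped level by level (roots first), sorted within each level.
generations = [sorted(gen) for gen in nx.topological_generations(g)]

print(roots, children_of_a, below_a, generations)
```

    The level-by-level grouping is the part that maps most directly onto the rows of a vertical tree.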

    Here's how you would use one of networkx's graphviz-free layouts, the multipartite layout, in which you assign nodes to layers explicitly:

    import networkx as nx
    import matplotlib.pyplot as plt
    nodes = {'A': 'A',
             'B': 'B',
             'C': 'C',
             'D': 'D',
             'E': 'E'}
    edges = [('A', 'B'),
             ('A', 'C'),
             ('A', 'D')]
    layers = {'A': 1,
              'E': 1,
              'B': 2,
              'C': 2,
              'D': 2}
    
    nx_graph = nx.DiGraph()
    plt.figure(figsize=(4,4))
    for key, value in nodes.items():
        nx_graph.add_node(key, name=value, layer=layers[key])
    for edge in edges:
        nx_graph.add_edge(*edge)
    pos = nx.multipartite_layout(nx_graph, subset_key="layer")
    
    
    # set the location of the nodes for each set
    # (sets are unordered, so which node gets which x position can differ between runs)
    nodes_0 = {n for n in nodes if nx_graph.nodes[n]['layer'] == 1}
    nodes_1 = {n for n in nodes if nx_graph.nodes[n]['layer'] == 2}
    # by default, this graph goes left to right - you have to change the coords to make it top-down
    pos.update((n, (i+1, 1)) for i, n in enumerate(nodes_0))
    pos.update((n, (i, -1)) for i, n in enumerate(nodes_1))
    nx.draw(nx_graph, pos=pos, labels=nodes, with_labels=True, node_color='w')
    plt.show()
    

    This gives you pretty much the same-looking graph, with the bonus that the positions match up a bit better with your 'coordinates' requirement:

    print(pos)
    {'E': (2, 1), 'A': (1, 1), 'D': (0, -1), 'C': (1, -1), 'B': (2, -1)}
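
    The `nodes`, `edges` and `layers` inputs above were typed by hand. As a rough sketch, they could instead be derived from the original `units` list with a breadth-first walk from the roots. The `children` and `roots` names are my own, and the code assumes every parent named in `units` also appears as a unit and that there are no cycles.

```python
from collections import deque

units = [
    {"name": "A", "parent": None},
    {"name": "B", "parent": "A"},
    {"name": "C", "parent": "A"},
    {"name": "D", "parent": "A"},
    {"name": "E", "parent": None},
]

# Map each node to its list of children and collect the roots.
children = {u["name"]: [] for u in units}
roots = []
for u in units:
    if u["parent"] is None:
        roots.append(u["name"])
    else:
        children[u["parent"]].append(u["name"])

# Breadth-first walk: roots are layer 1, their children layer 2, and so on.
layers = {}
queue = deque((root, 1) for root in roots)
while queue:
    name, layer = queue.popleft()
    layers[name] = layer
    queue.extend((child, layer + 1) for child in children[name])

# Same shapes as the hand-written inputs of the multipartite example.
nodes = {name: name for name in children}
edges = [(u["parent"], u["name"]) for u in units if u["parent"] is not None]

print(layers)
print(edges)
```

    With these three objects in place, the rest of the multipartite example can be used as is.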