Create a Network based on a specific string column inside dataframe


I have a data frame like the one in the picture below:

[image: the data frame]

But I am having trouble creating this network. I used this code:

G = nx.complete_graph(df.Members)
list(G.edges())

But I got an error. My question is how I can create this network from this type of data. I should mention that each group is a complete graph and I want to assemble these groups into a single graph. Moreover, the member strings contain some white space, which leads to duplicated nodes.


Solution

  • Plotting this with your formatting is at least another SO question by itself, so I stuck with parsing the data roughly as you showed it, generating a graph, and doing a crude plot. This is the result of the code block below:

    [image: shapes_graph]

    All I did from a plot formatting perspective was

    1. include node labels
    2. fix the positions of nodes B, C, F, J, and K
    3. fiddle with nx.spring_layout() parameters a bit to make the plot more recognizable.

    import pandas as pd
    import networkx as nx
    
    
    d = {'Group Name': {1: 'Alpha', 2: 'Beta', 3: 'Gamma', 4: 'Omega'}, 'Members': {1: 'A, B, C', 2: 'C, D, E, F', 3: 'F, G, H, I, J', 4: 'J, K, L,M,N, O'}, 'Weight': {1: 'W1', 2: 'W2', 3: 'W3', 4: 'W4'}}
    df = pd.DataFrame.from_dict(d)
    
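    # Build one complete graph per group, stripping whitespace so that
    # ' B' and 'B' end up as the same node.
    subgraphs = []
    for members, weight in zip(df['Members'], df['Weight']):
        nodes = [node.strip() for node in members.split(",")]
        subgraph = nx.complete_graph(nodes)
        nx.set_edge_attributes(subgraph, weight, name='Weight')
        subgraphs.append(subgraph)
    
    # Merge the per-group graphs; nodes shared between groups (C, F, J) link them.
    G = nx.compose_all(subgraphs)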
    
    node_types = {'A': 1, 'B': 1, 'C': 2, 'D': 1, 'E': 2, 'F': 1, 'G': 1, 'H': 1, 'I': 2, 'J': 2, 'K': 1, 'L': 1, 'M': 2, 'N': 1, 'O': 1}
    nx.set_node_attributes(G, node_types, name='Type')
    
    pos_fixed = {'B': (1, 0),
                 'C': (2, 0),
                 'F': (3, 0),
                 'J': (4, 0),
                 'K': (5, 0)}
    
    # Keep the pinned nodes where they are and let spring_layout place the rest.
    pos = nx.spring_layout(G, k=1.25, pos=pos_fixed, fixed=pos_fixed.keys(), seed=42)
    
    nx.draw(G, pos, with_labels=True)
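
As a quick sanity check on the whitespace issue from the question, the sketch below rebuilds the graph from the same sample data and compares the node names with and without stripping. The helper name `split_members` is made up for this illustration and is not part of pandas or networkx.

```python
import networkx as nx
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame({
    "Group Name": ["Alpha", "Beta", "Gamma", "Omega"],
    "Members": ["A, B, C", "C, D, E, F", "F, G, H, I, J", "J, K, L,M,N, O"],
    "Weight": ["W1", "W2", "W3", "W4"],
})


def split_members(members):
    """Hypothetical helper: split on commas and strip whitespace from each name."""
    return [name.strip() for name in members.split(",")]


# Without stripping, names such as ' C' and 'C' count as different nodes.
raw_nodes = {name for members in df["Members"] for name in members.split(",")}
clean_nodes = {name for members in df["Members"] for name in split_members(members)}

# One complete graph per group, merged into a single graph.
subgraphs = [nx.complete_graph(split_members(members)) for members in df["Members"]]
G = nx.compose_all(subgraphs)

print(len(raw_nodes), len(clean_nodes))
print(G.number_of_nodes(), G.number_of_edges())
```

If the stripped set is smaller than the raw one, the difference is the number of duplicate nodes that whitespace would have introduced.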