Tags: python, pandas, networkx, graph-theory, adjacency-matrix

Networkx does not return a nice graph from adjacency matrix


I have a matrix that is as follows:

adjacency_matrix = [['A', 1, 1, 0, 2], ['B', 1, 1, 1, 3], ['C', 0, 0, 1, 1]]

Each row says which elements a name belongs to: A is in "Element 1" and "Element 2" but not in "Element 3", because its values are 1, 1 and 0.

B is in "Element 1", "Element 2" and "Element 3", because all of its values are 1, and so on. The last value in each sublist is the sum of the 0s and 1s before it.
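
To make that reading concrete, here is a minimal sketch in plain Python that lists the elements each name belongs to and checks the trailing sum. The names `elements` and `memberships` are introduced only for this illustration and are not part of the original code.

```python
adjacency_matrix = [['A', 1, 1, 0, 2], ['B', 1, 1, 1, 3], ['C', 0, 0, 1, 1]]

# Column labels for the three 0/1 flags (assumed to match the DataFrame columns).
elements = ["Element 1", "Element 2", "Element 3"]

memberships = {}
for name, *flags, total in adjacency_matrix:
    # The last entry of each row should equal the sum of its flags.
    assert sum(flags) == total
    # Keep only the elements whose flag is 1.
    memberships[name] = [elem for elem, flag in zip(elements, flags) if flag == 1]

print(memberships)
```

Each (name, element) pair in `memberships` is one connection the final graph should contain.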

I created a pandas DataFrame to save this into a CSV file. Before saving, I sort the rows by the sum and then drop the last column (Sum).

df = pd.DataFrame(adjacency_matrix, columns = ["Name", "Element 1", "Element 2", "Element 3", "Sum"])
df = df.sort_values(by=['Sum'], ascending=False)
df = df.iloc[:, :-1]

My next step is to use the adjacency matrix and create a nice graph of connections.

G=from_pandas_edgelist(df, source="Name", target=["Name", "Element 1", "Element 2", "Element 3"])
nx.draw_circular(G, with_labels=True)
plt.axis('equal')
plt.show()

What am I doing wrong? I do not get the undirected graph with "A" connected to both Element 1 and Element 2. I have a feeling my source and target are wrong.



Solution

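  • from_pandas_edgelist expects one row per edge, with a single source column and a single target column, so restructure your adjacency matrix into an edgelist first. Here's an example using DataFrame.melt and DataFrame.query: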

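    import networkx as nx
    import pandas as pd

    df = pd.DataFrame(adjacency_matrix, columns = ["Name", "Element 1", "Element 2", "Element 3", "Sum"])
    df = df.sort_values(by=['Sum'], ascending=False)
    df = df.iloc[:, :-1]  # drop the "Sum" column
    
    # Wide to long: one row per (Name, Element) pair, keeping only the pairs flagged 1
    df_edges = (df.melt(id_vars='Name', var_name='target')
                .query('value==1'))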
    

    [out]

      Name     target  value
    0    B  Element 1      1
    1    A  Element 1      1
    3    B  Element 2      1
    4    A  Element 2      1
    6    B  Element 3      1
    8    C  Element 3      1
    
    
    G = nx.from_pandas_edgelist(df_edges, source='Name', target='target')
    nx.draw_networkx(G)
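
For anyone who wants to check the result without looking at the plot, here is a self-contained sketch of the same melt-then-build approach. It is an illustration under a few assumptions, not the only way to do this: it drops the Sum column by name instead of sorting and slicing, and the helper `draw_two_columns` is a hypothetical name that uses `nx.bipartite_layout` to put names in one column and elements in the other, which is a layout choice not taken from the answer above.

```python
import networkx as nx
import pandas as pd

adjacency_matrix = [['A', 1, 1, 0, 2], ['B', 1, 1, 1, 3], ['C', 0, 0, 1, 1]]
columns = ["Name", "Element 1", "Element 2", "Element 3", "Sum"]

# Drop the Sum column by name; row order does not affect the resulting graph.
df = pd.DataFrame(adjacency_matrix, columns=columns).drop(columns="Sum")

# Wide to long: one row per (Name, Element) pair, keeping only the 1s.
df_edges = df.melt(id_vars="Name", var_name="target").query("value == 1")

# Build an undirected graph with one edge per remaining row.
G = nx.from_pandas_edgelist(df_edges, source="Name", target="target")

# Sanity checks against the description in the question.
print(sorted(G.neighbors("A")))
print(G.number_of_nodes(), G.number_of_edges())


def draw_two_columns(graph, left_nodes):
    """Hypothetical helper: draw left_nodes in one column, the rest in the other."""
    import matplotlib.pyplot as plt  # imported here so the checks above run without it

    pos = nx.bipartite_layout(graph, left_nodes)
    nx.draw_networkx(graph, pos=pos)
    plt.axis("off")
    plt.show()


# Uncomment to display the figure:
# draw_two_columns(G, list(df["Name"]))
```

If the neighbors of "A" come back as Element 1 and Element 2 only, the edgelist was built as intended.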
    
