python pandas networkx minimum-spanning-tree

Networkx graph and labels


I am having some trouble understanding how the networkx library handles node labels. Let's assume I have a correlation matrix in a pandas DataFrame:

import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt

D = pd.DataFrame(
    {'A': [1, 0.5, 0.1], 'B': [0.5, 1, 0.3], 'C': [0.1, 0.3, 1]},
    index=['A', 'B', 'C'],
)

I would now like to plot a simple graph representation of this correlation matrix (a triangle in this example), and then generate the minimum spanning tree for bigger correlation or distance matrices.

corr_graph = nx.from_pandas_adjacency(D)
pos = nx.spring_layout(corr_graph)
nx.draw_networkx_nodes(corr_graph, pos=pos, label=['A', 'B', 'C'])
nx.draw_networkx_edges(corr_graph, pos=pos)
nx.draw_networkx_edge_labels(corr_graph, pos=pos)
plt.axis('off')
plt.show()

The graph is generated, with the correct label on each edge. The nodes carry self-loop edges labelled {'weight':1}, but the nodes themselves have no labels. I want them labelled A, B and C, as in my initial dataframe, so I can identify them. My other question is how to remove the self-loop edge labels.

I'd like to do the same with the minimum spanning tree but first I am just trying to do it on the simple graph.

Thank you,


Solution

  • Drawing node labels

    (use the built-in function, which labels each node with its name by default):

    nx.draw_networkx_labels(corr_graph, pos=pos)
    

    Removing self loops:

    method 1:

    set the diagonal to zero, and then create the graph:

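    import numpy as np

    # for example: subtracting the identity zeroes the diagonal
    # (works here because a correlation matrix has 1s on the diagonal)
    E = D - np.eye(D.shape[0])
    corr_graph = nx.from_pandas_adjacency(E)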
    

    method 2:

    create the graph, and only draw the edges (and edge labels) whose source and destination differ.

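    corr_graph = nx.from_pandas_adjacency(D)
    # keep only edges between two distinct nodes
    edges = [(a, b) for (a, b) in corr_graph.edges() if a != b]
    nx.draw_networkx_edges(corr_graph, edgelist=edges, pos=pos)
    # pass an explicit label dict so the self-loop labels are skipped too
    labels = {(a, b): w for (a, b, w) in corr_graph.edges(data='weight') if a != b}
    nx.draw_networkx_edge_labels(corr_graph, pos=pos, edge_labels=labels)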
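
Putting the pieces together, here is a minimal sketch of the whole workflow, including the minimum spanning tree the question asks about. A few things in it are assumptions rather than part of the answer above: it removes the self-loops from the graph itself with nx.selfloop_edges (a third option next to methods 1 and 2), the helper name draw_weighted is made up for illustration, and the correlation-to-distance conversion d = sqrt(2 * (1 - rho)) is only one common choice. A spanning tree built directly on correlations would favour the weakest links, so some conversion of this kind is probably what you want, but pick the one that suits your data.

```python
import matplotlib.pyplot as plt
import networkx as nx
import numpy as np
import pandas as pd

# Same toy correlation matrix as in the question.
D = pd.DataFrame(
    {"A": [1, 0.5, 0.1], "B": [0.5, 1, 0.3], "C": [0.1, 0.3, 1]},
    index=["A", "B", "C"],
)

corr_graph = nx.from_pandas_adjacency(D)
# Remove the diagonal entries (self-loops) from the graph itself.
# list(...) avoids mutating the graph while iterating over it.
corr_graph.remove_edges_from(list(nx.selfloop_edges(corr_graph)))


def draw_weighted(graph, ax=None, seed=0):
    """Draw nodes, node labels, edges and edge weights.

    Hypothetical helper, not a networkx function.
    """
    if ax is None:
        _, ax = plt.subplots()
    pos = nx.spring_layout(graph, seed=seed)  # seed makes the layout repeatable
    nx.draw_networkx_nodes(graph, pos=pos, ax=ax)
    nx.draw_networkx_labels(graph, pos=pos, ax=ax)  # defaults to the node names
    nx.draw_networkx_edges(graph, pos=pos, ax=ax)
    weights = nx.get_edge_attributes(graph, "weight")
    nx.draw_networkx_edge_labels(graph, pos=pos, edge_labels=weights, ax=ax)
    ax.set_axis_off()
    return ax


# Assumption: turn correlations into distances so that strongly
# correlated nodes end up close together.
dist = np.sqrt(2 * (1 - D))
# The diagonal of dist is 0, so no self-loops are created here.
dist_graph = nx.from_pandas_adjacency(dist)
mst = nx.minimum_spanning_tree(dist_graph, weight="weight")
```

To see the plots, call draw_weighted(corr_graph) or draw_weighted(mst) and then plt.show(). Because the self-loops are gone from the graph, no edge list or label filtering is needed at draw time.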