python graph networkx minimum-spanning-tree

NetworkX Minimum Spanning Tree has different cluster arrangement with the same data?


I have a large dataset that compares pairs of products using a relatedness measure. It looks like this:

product1      product2  relatedness
0101          0102      0.047619
0101          0103      0.023810
0101          0104      0.095238
0101          0105      0.214286
0101          0106      0.047619
...           ...       ...

I used the following code to feed the data into the NetworkX graphing tool and produce an MST diagram:

import networkx as nx
import matplotlib.pyplot as plt

# `data` is assumed to be a pandas DataFrame with the columns shown above.
# Collect the unique product codes in sorted order.
products = sorted(set(data['product1']))

G = nx.Graph()
G.add_nodes_from(products)
print(G.number_of_nodes())
print(G.nodes())

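# Add an edge for every product pair with positive relatedness.
# (with_labels is a drawing option, not an edge attribute, so it is not passed here.)
for c, p, w in zip(data['product1'], data['product2'], data['relatedness']):
    if w > 0:
        G.add_edge(c, p, weight=w)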

nx.draw(nx.minimum_spanning_tree(G), with_labels=True)
plt.show()

The resulting diagram looks like this: https://i.sstatic.net/LBrnD.jpg

However, when I re-run the code with the same data and no modifications, the arrangement of the clusters changes. One example is here: https://i.sstatic.net/jR62Q.jpg, and a second is here: https://i.sstatic.net/PLHyo.jpg. The clusters, edges, and weights do not appear to change, but their arrangement in the plot is different each time.

What causes the arrangement of the nodes to change each time without any changes to the code or data? How can I re-write this code to produce a network diagram with approximately the same arrangement of nodes and edges for the same data each time?


Solution

  • By default, nx.draw positions the nodes with spring_layout. This layout implements the Fruchterman-Reingold force-directed algorithm, which starts from random initial positions, so each run can settle into a different arrangement. That is the effect you are seeing across your repeated runs.

    If you want to "fix" the positions, call spring_layout explicitly and either specify the initial positions in its pos argument or pass a fixed seed. Then hand the resulting positions to nx.draw through its pos argument.
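    As a minimal sketch of that idea, the snippet below computes the layout once with a fixed seed and passes it to nx.draw. The edge list is a toy stand-in built from the sample rows in the question, and the helper names stable_layout and draw_mst and the seed value 42 are illustrative choices, not part of the original code. Passing seed is used here as a simpler alternative to supplying initial positions by hand.

```python
import networkx as nx

# Toy stand-in for the question's data: (product1, product2, relatedness).
edges = [
    ("0101", "0102", 0.047619),
    ("0101", "0103", 0.023810),
    ("0101", "0104", 0.095238),
    ("0101", "0105", 0.214286),
    ("0101", "0106", 0.047619),
]

G = nx.Graph()
G.add_weighted_edges_from(edges)  # stores each third element as the "weight" attribute
mst = nx.minimum_spanning_tree(G)


def stable_layout(graph, seed=42):
    """Spring layout with a fixed seed, so repeated calls start from the same positions."""
    return nx.spring_layout(graph, seed=seed)


def draw_mst(graph, seed=42):
    """Draw the graph using the seeded layout instead of nx.draw's default random one."""
    import matplotlib.pyplot as plt  # imported here so the layout code needs no display

    nx.draw(graph, pos=stable_layout(graph, seed), with_labels=True)
    plt.show()


# Two independent layout computations with the same seed.
pos_a = stable_layout(mst)
pos_b = stable_layout(mst)
```

    Calling draw_mst(mst) should then give the same picture on every run with the same data. Changing the seed gives a different, but equally repeatable, arrangement.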