Tags: python, graph, data-visualization, networkx

NetworkX graph: creating nodes with ordered list


I am completely new to graphs. I have a 213 x 213 distance matrix that I am trying to visualize with networkx. My idea is that nodes that are far apart will appear as separate clusters when the graph is plotted. I am creating a graph whose nodes represent column indices. I need to add the edges in a specific order, so I need to keep track of the nodes and their labels in order to label them afterwards.

Here is the code:

import networkx as nx
G = nx.Graph()
G.add_nodes_from(time_pres)  # time_pres is the list of labels that I want the nodes to have

for i in range(212):
    for j in range(i+1, 212):
        color = ['green' if j == i+1 else 'red'][0]
        edges.append((i,j, dist[i,j], 'green'))  # this requires assigning distances in the order of the dist matrix
        G.add_edge(i,j, dist = dist[i,j], color = 'green')

With the way I am doing it now, the nodes get integer ids that do not correspond to the index of the labels in time_pres.


Solution

  • I can answer the question you seem to be asking, but this won't be the end of your troubles. Specifically, I'll show you where you go wrong.

    So, we assume that the variable time_pres is defined as follows:

    time_pres = [('person1', '1878'), ('person2', '1879')]  # etc.
    

    Then,

    G.add_nodes_from(time_pres)
    

    creates the nodes with labels ('person1', '1878'), ('person2', '1879'), etc. These nodes are held in a dictionary whose keys are the node labels and whose values are any additional attributes of each node. In your case, you have no attributes. You can also see this in the online manual, or by typing help(G.add_nodes_from).

    You can even see the label of the nodes by typing either of the following lines.

    G.nodes()        # either this
    G.nodes.keys()   # or this
    

    This will show the node labels. They come from a dictionary, so on older Python versions they may not be in the same order as time_pres (since Python 3.7, dictionaries preserve insertion order). You refer to the nodes by these labels. They don't have any additional id numbers, or anything else.
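
    As a minimal sketch of the point above, assuming a short, hypothetical time_pres list of (name, year) tuples, the following shows that each tuple is itself the node and that no integer id is created alongside it:

```python
import networkx as nx

# Hypothetical labels, following the (name, year) shape assumed above.
time_pres = [("person1", "1878"), ("person2", "1879"), ("person3", "1880")]

G = nx.Graph()
G.add_nodes_from(time_pres)

# Each tuple is the node itself; there is no separate integer id.
node_labels = list(G.nodes)
print(node_labels)

print(("person1", "1878") in G)  # membership is tested by label
print(0 in G)                    # the integer 0 is not a node
```

    If you need the position of a label in time_pres, time_pres.index(label) or a dictionary built from enumerate(time_pres) gives it to you without involving the graph.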

    Now, for adding an edge. The manual says that either of the two nodes will be added if it is not already in the graph. So, when you do

    G.add_edge(i, j, dist = dist[i,j], color = 'green')
    

    where i and j are numbers, they are added to the graph as new nodes, since no existing node has those labels. So you end up adding the nodes i and j along with the edge between them. Instead, you want to do

    G.add_edge(time_pres[i], time_pres[j], dist = dist[i,j], color = 'green')
    

    This will add an edge between the nodes time_pres[i] and time_pres[j]. As far as I understand, this is your aim.
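
    Here is a small sketch that puts the two variants side by side. The three labels and the 3 x 3 dist matrix are made up for illustration, and the green/red rule is only my reading of the color line in your loop:

```python
import networkx as nx
import numpy as np

# Hypothetical labels and a hypothetical symmetric distance matrix.
time_pres = [("person1", "1878"), ("person2", "1879"), ("person3", "1880")]
dist = np.array([[0.0, 1.0, 4.0],
                 [1.0, 0.0, 2.0],
                 [4.0, 2.0, 0.0]])

# Pitfall: integer endpoints are not existing labels, so they become new nodes.
G_wrong = nx.Graph()
G_wrong.add_nodes_from(time_pres)
G_wrong.add_edge(0, 1, dist=float(dist[0, 1]))
print(G_wrong.number_of_nodes())  # 5: the three labels plus the integers 0 and 1

# Fix: use the loop indices only to look up the labels and the distance.
G = nx.Graph()
G.add_nodes_from(time_pres)
n = len(time_pres)
for i in range(n):
    for j in range(i + 1, n):
        # Assumed intent: consecutive labels get green edges, all others red.
        color = "green" if j == i + 1 else "red"
        G.add_edge(time_pres[i], time_pres[j], dist=float(dist[i, j]), color=color)

print(G.number_of_nodes(), G.number_of_edges())
```

    Looping up to len(time_pres) rather than a hard-coded number also keeps the last row and column of the matrix from being skipped.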


    However, you seem to expect that when you draw the graph, the distance between nodes time_pres[i] and time_pres[j] will be determined by the attribute dist=dist[i,j] in G.add_edge(). In fact, the position of each node is determined by a tuple holding its x and y coordinates. From the manual for nx.draw():

    pos : dictionary, optional

    A dictionary with nodes as keys and positions as values. If not specified a spring layout positioning will be computed. See networkx.layout for functions that compute node positions.

    If you don't define the node positions, nx.draw() computes a spring layout, which starts from random positions and does not look at your dist attribute. In your case, you would need a dictionary like

    pos = {('person1', '1878'): (23, 10),
           ('person2', '1879'): (18, 11)}  # etc.
    

    With such a dictionary, the coordinates of nodes i and j would give a distance equal to dist[i,j]. You would have to work out these coordinates yourself; since you haven't made clear exactly how you derived the matrix dist, I can't say more about that.
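
    For what it's worth, one common way to turn a distance matrix into coordinates is classical multidimensional scaling. This is a technique I am bringing in, not something networkx does for you, and the sketch below assumes that dist is symmetric and behaves roughly like Euclidean distances; the function name classical_mds and the example matrix are hypothetical.

```python
import numpy as np

def classical_mds(dist, dim=2):
    """Return an (n, dim) array whose pairwise distances approximate dist."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-center the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # Keep the eigenvectors belonging to the largest eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

# Hypothetical labels and distances (taken from the points (0,0), (3,0), (0,4)).
time_pres = [("person1", "1878"), ("person2", "1879"), ("person3", "1880")]
dist = np.array([[0.0, 3.0, 4.0],
                 [3.0, 0.0, 5.0],
                 [4.0, 5.0, 0.0]])

coords = classical_mds(dist)

# Build the pos dictionary that nx.draw() expects: label -> (x, y).
pos = {label: (float(x), float(y)) for label, (x, y) in zip(time_pres, coords)}
print(pos)
```

    You would then pass it as nx.draw(G, pos=pos). If your distances cannot be reproduced exactly in two dimensions, the plotted distances will only approximate dist.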