
Coloring specific nodes based on a condition within a network


I have a dataset with a list of accounts, sorted by a variable called Time:

Account Time
13124    1
215732   2 
76239    3
76054    4
975235   

I also have a graph built from the full list of accounts:

Account1 Account2
13124    215732
215732   418954
5130953  214182
760524   5398723
975235   13124

The graph is built using networkx:

G = nx.from_pandas_edgelist(df, 'Account1', 'Account2')

I would like to visualize the nodes from the first list (Account, Time) within the network, highlighting each node according to its time. This could be achieved in either of two ways:

  • plot multiple graphs, one for each time;
  • or plot a single graph, coloring the nodes based on the time of opening. Nodes that are in the graph but not in the list, or that do not have a time assigned (e.g., 975235), should get a default color to distinguish them.

I would like to better understand how to select (and color) only the nodes from the first list within the network.


Solution

  • One way to do this is to build a colormap, pair each color with the time associated with a node, and pass the resulting list of colors to the node_color argument of nx.draw. You can also add a legend for the nodes by creating empty placeholder scatter plots. See the code below for details:

    import numpy as np
    import matplotlib.pyplot as plt
    import networkx as nx
    import pandas as pd
    from matplotlib import cm

    df = pd.read_fwf('graph.txt')           # (Account1, Account2) dataframe
    df_time = pd.read_fwf('timestamp.txt')  # (Account, Time) dataframe

    G = nx.from_pandas_edgelist(df, 'Account1', 'Account2')

    # Setting up the colormap: one color per distinct time value
    times = sorted(df_time['Time'].dropna().astype(int).unique())
    cm_dis = np.linspace(0, 1, len(times))
    colors = {t: cm.viridis(x) for t, x in zip(times, cm_dis)}
    node_colors = []
    seen = set()  # legend labels that have already been added

    # Pairing each node with the color associated with its time
    for node in G:
        temp = df_time.loc[df_time['Account'] == node, 'Time']  # time of the node

        if temp.empty or temp.isnull().any():  # no time associated with the node
            color, label = 'tab:orange', 'No time'
        else:
            t = int(temp.iloc[0])
            color, label = colors[t], 'Time: ' + str(t)

        if label not in seen:  # an empty scatter plot acts as a legend entry
            plt.scatter([], [], color=color, label=label)
            seen.add(label)
        node_colors.append(color)

    # Drawing graph and legend
    nx.draw(G, with_labels=True, node_color=node_colors)
    plt.legend()
    plt.show()
    

    This code produces the following output:

    [image: resulting graph]
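The node-to-color pairing can also be pulled out into a small helper that does not depend on pandas or matplotlib, which makes the default-color rule easy to check on its own. The following is only a sketch under assumptions: the function name `colors_by_time`, the `time_of` and `palette` dictionaries, and the palette colors are hypothetical and chosen for illustration; only the sample accounts and the 'tab:orange' default come from the post above.

```python
import math


def colors_by_time(nodes, time_of, palette, default="tab:orange"):
    """Return one color per node, in iteration order.

    nodes   -- iterable of node ids (iterating a networkx graph yields its nodes)
    time_of -- dict mapping account -> time; a missing key or NaN means "no time"
    palette -- dict mapping time -> color
    default -- color used for nodes without a usable time
    """
    result = []
    for node in nodes:
        t = time_of.get(node)
        if t is None or (isinstance(t, float) and math.isnan(t)):
            result.append(default)  # node not in the list, or time is missing
        else:
            result.append(palette.get(int(t), default))
    return result


# Sample data from the question (975235 has no time assigned)
time_of = {13124: 1, 215732: 2, 76239: 3, 76054: 4, 975235: float("nan")}
# Illustrative palette (an assumption, not taken from the answer above)
palette = {1: "tab:blue", 2: "tab:green", 3: "tab:red", 4: "tab:purple"}
nodes = [13124, 215732, 418954, 975235]

print(colors_by_time(nodes, time_of, palette))
# prints ['tab:blue', 'tab:green', 'tab:orange', 'tab:orange']
```

With the dataframes from the answer, `time_of` could be built as `dict(zip(df_time['Account'], df_time['Time']))`, and the returned list passed directly as `node_color` to `nx.draw(G, ...)`, since iterating over `G` yields the nodes in the same order that `nx.draw` uses.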