python colors networkx edges reindex

How to make edge types into categories? Especially how to reindex the df according to G.edges()?


I am drawing a graph in networkx. The nodelist and edgelist are read from csv using pandas. The columns of edgelist include: 'source', 'target', 'value', like this:

    source  target  value
    A       B       1
    A       C       2
    A       D       3
    H       B       1
    H       E       2

I need to map a color to each type of edge. I tried to use pd.Categorical() for this. However, it seems to require the df to be reindexed according to a collection, which I suppose refers to G.edges(). But when I use edge = edge.reindex(G.edges()), edge['value'].cat.codes comes out wrong, like this:

    A  B   -1
       C   -1
       D   -1
    H  B   -1
       E   -1

It seems the duplicated nodes in the source column are dropped automatically, and this leads to a wrong mapping of colors to the edges. How can I solve this problem? Thanks.


Solution

  • If you are trying to use the value column of your dataframe as a categorical variable to color your edges by, I don't think you need to do anything more to the dataframe. Just pass the value column as an edge attribute when building your network.

    Then, when you plot, you can color by the list of edge attributes:

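    import pandas as pd
    import networkx as nx
    import matplotlib.pyplot as plt
    
    data = {'source': {0: 'A', 1: 'A', 2: 'A', 3: 'H', 4: 'H'},
            'target': {0: 'B', 1: 'C', 2: 'D', 3: 'B', 4: 'E'},
            'value': {0: 1, 1: 2, 2: 3, 3: 1, 4: 2}}
    
    df = pd.DataFrame.from_dict(data)
    
    # Store 'value' on each edge so it travels with the graph
    G = nx.from_pandas_edgelist(df, edge_attr='value')
    # Read the values back in G.edges() order, so colors line up with edges
    colors = [d['value'] for (u, v, d) in G.edges(data=True)]
    # plt.get_cmap replaces matplotlib.cm.get_cmap, removed in newer matplotlib
    nx.draw(G, edge_color=colors, edge_cmap=plt.get_cmap('jet'))
    plt.show()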
    

    gives a plot of the network with its edges colored by value (image: edge_colored_network_plot).
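If you do want discrete categories rather than a continuous colormap, here is a minimal sketch of one way to do it without reindexing. The -1 codes in the question are what pandas reports for missing values, so the reindex most likely looked up (u, v) labels that are not in the DataFrame's index. An undirected graph may also report an edge as (B, H) where the DataFrame has (H, B), which is another reason to carry the code on the edge itself. The `code` column name and the `palette` list are assumptions made up for this illustration, not part of the original data.

```python
import pandas as pd
import networkx as nx

df = pd.DataFrame({
    "source": ["A", "A", "A", "H", "H"],
    "target": ["B", "C", "D", "B", "E"],
    "value": [1, 2, 3, 1, 2],
})

# Treat 'value' as a categorical: codes run from 0 to n_categories - 1.
# 'code' is a hypothetical column name used only in this sketch.
df["code"] = pd.Categorical(df["value"]).codes

# Carry both the raw value and its category code as edge attributes.
G = nx.from_pandas_edgelist(df, edge_attr=["value", "code"])

# Hypothetical palette: one color per category, indexed by category code.
palette = ["tab:blue", "tab:orange", "tab:green"]

# Build the color list in G.edges() order so it lines up with the drawn edges.
edge_colors = [palette[d["code"]] for _, _, d in G.edges(data=True)]
```

Passing `edge_color=edge_colors` to `nx.draw(G, ...)` should then give every edge with the same value the same color, regardless of the order in which networkx reports the edges.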