Tags: python, pandas, networkx, directed-graph

How to convert a dataframe into a list of 3-tuples


I would like to create a directed graph using the networkx library in Python.

I have a pandas dataframe that looks like this:

                                 Head Mounted Display  Marker  Smartphone
    2D data extrusion                               3       0           1   
    AgiSoft PhotoScan 3D design                     1       2           2   
    AuGeo Esri AR template                          1       1           2   
    BIM                                             1       1           0   
    Blender 3D design                               0       2           4   
    Bluetooth localization                          1       1           0   
    CityEngine                                      3       1           2   
    GIS data processing                             3       1           2   
    GNSS localization                               1       2           4   
    Google ARCore                                   0       1           5   
    Google SketchUp 3D design                       1       2           0   
    Image Stitching                                 1       1           4   
    Java Development Kit                            0       1           0   
    SLAM                                            1       2           2   
    Unity 3D                                        8      12          10   
    Unreal Engine                                   1       1           0   
    Vuforia                                         2       7           3

As input for the "networkx.DiGraph.add_weighted_edges_from" function, I need to format this as a list of 3-tuples like this:


('Head Mounted Display', '2D data extrusion', 3),
('Head Mounted Display', 'AgiSoft PhotoScan 3D design', 1),
('Head Mounted Display', 'AuGeo Esri AR template', 1),
etc...

Furthermore, tuples that have a weight of 0 such as:

('Marker', '2D data extrusion', 0)

need to be removed from the list.

Does anyone have an idea how to do this?

Thanks in advance!


Solution

  • Similar to the answer by @SultanOrazbayev, you can melt the dataframe, but then pass the melted dataframe directly to nx.from_pandas_edgelist, which builds the graph without first creating the list of tuples.

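    import matplotlib.pyplot as plt
    import networkx as nx
    import pandas as pd

    # Sample df: the first five rows of the question's table, with row labels as the index
    df = pd.DataFrame({'Head Mounted Display':[3,1,1,1,0],'Marker':[0,2,1,1,2],'Smartphone':[1,2,2,0,4]},
                      index=['2D data extrusion','AgiSoft PhotoScan 3D design','AuGeo Esri AR template','BIM','Blender 3D design'])
    # melt the dataframe (column names become the edge sources) and filter out the rows with weight of zero
    df_long_temp = df.reset_index().melt(id_vars='index',var_name='source',value_name='weight')
    df_long = df_long_temp[df_long_temp['weight'] != 0]

    # create the graph with edge weights; edges run column -> row label, as in the question's tuples
    g = nx.from_pandas_edgelist(df_long,source='source',target='index',
                            edge_attr='weight',create_using=nx.DiGraph)

    # drawing the graph
    pos = nx.spring_layout(g)
    nx.draw_networkx(g,pos=pos)
    weight_dict = {(u,v):'w={}'.format(w) for u,v,w in g.edges(data='weight')}
    nx.draw_networkx_edge_labels(g,pos=pos,edge_labels=weight_dict)
    plt.show()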
    

[Figure: graph with edge weights]
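
If you do want the explicit list of 3-tuples from the question, for example to pass to add_weighted_edges_from, here is one possible sketch. It assumes the row labels are the dataframe index and uses only a three-row subset of the question's table. The column names target, source and weight are just names picked for this illustration.

```python
import networkx as nx
import pandas as pd

# Three rows taken from the question's table, with the row labels as the index.
df = pd.DataFrame(
    {
        "Head Mounted Display": [3, 1, 0],
        "Marker": [0, 2, 2],
        "Smartphone": [1, 2, 4],
    },
    index=["2D data extrusion", "AgiSoft PhotoScan 3D design", "Blender 3D design"],
)

# Wide -> long: one row per (row label, column name) pair.
# "target", "source" and "weight" are illustrative column names.
long_df = (
    df.rename_axis("target")
    .reset_index()
    .melt(id_vars="target", var_name="source", value_name="weight")
)

# Drop the zero-weight pairs, as requested in the question.
long_df = long_df[long_df["weight"] != 0]

# Plain tuples in (source, target, weight) order.
edges = list(
    long_df[["source", "target", "weight"]].itertuples(index=False, name=None)
)

# Build the directed graph from the tuple list.
g = nx.DiGraph()
g.add_weighted_edges_from(edges)
```

Here edges starts with ('Head Mounted Display', '2D data extrusion', 3), and pairs such as ('Marker', '2D data extrusion', 0) are no longer in the list.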