python pandas networkx social-networking

How to make edges from a dataframe in NetworkX


I have a dataframe with about 41k rows that contains movies, actors' names, etc.

I'm planning to build a graph with the NetworkX library. I want to use actors as nodes and add an edge between two actors if they played in the same movie. I tried to do this from the dataframe with for loops, but I couldn't get it to work. Can you help me?

Edit: I want to make a graph like this: [image]


Solution

  • IIUC, let's try something like this using the networkx and itertools libraries:

    from itertools import tee
    import networkx as nx
    import pandas as pd
    
    df = pd.DataFrame({'Movie': [*'AAABBCCCDD'],
                      'Actor':[1,2,3,2,5,7,8,9,10,8]})
    
    def pairwise(iterable):
        "s -> (s0,s1), (s1,s2), (s2, s3), ..."
        a, b = tee(iterable)
        next(b, None)
        return zip(a, b)
    
    
    G = nx.Graph()
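    for _, s in df.groupby('Movie'):
        if s.shape[0] > 1:
            # Link each actor to the next one listed for the same movie.
            G.add_edges_from(pairwise(s['Actor']))
        else:
            # A movie with a single actor still contributes a node.
            G.add_node(s['Actor'].iloc[0])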
            
    nx.draw_networkx(G)
    
    [list(i) for i in nx.connected_components(G)]
    

    nx.draw_networkx(G) draws the graph; the last expression returns the actor groups (the connected components):

    [[1, 2, 3, 5], [8, 9, 10, 7]]
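
Note that pairwise only links consecutive rows within each movie, so in movie A actors 1 and 3 are not directly connected even though they share the film. The connected components come out the same, but if every pair of co-stars should get an edge, as the question describes, a minimal sketch using itertools.combinations might look like this (G_full is just an illustrative name, and the toy data is the same as above):

```python
from itertools import combinations

import networkx as nx
import pandas as pd

# Same toy data as in the answer above.
df = pd.DataFrame({'Movie': [*'AAABBCCCDD'],
                   'Actor': [1, 2, 3, 2, 5, 7, 8, 9, 10, 8]})

G_full = nx.Graph()

# Register every actor first, so actors from single-actor movies
# still show up as isolated nodes.
G_full.add_nodes_from(df['Actor'].unique().tolist())

for _, actors in df.groupby('Movie')['Actor']:
    # combinations(..., 2) yields every unordered pair of distinct cast members.
    cast = actors.unique().tolist()
    G_full.add_edges_from(combinations(cast, 2))

print(sorted(G_full.edges()))
```

With a large cast this adds n*(n-1)/2 edges per movie, so on the full 41k-row dataframe the graph may be considerably denser than the chained version.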