Tags: python, pandas, networkx, prediction, social-networking

Node u is not in the graph, common_neighbors networkX


I'm working on link prediction with networkx. I want to find the common neighbours of two nodes in my graph to see whether they could be linked, but I ran into the error shown further down.

This is my code:

import pandas as pd
import numpy as np
import random
import networkx as nx
from tqdm import tqdm
import re
import matplotlib.pyplot as plt

# load nodes details
with open("/content/drive/MyDrive/fb-pages-food.nodes") as f:
    fb_nodes = f.read().splitlines() 

# load edges (or links)
with open("/content/drive/MyDrive/fb-pages-food.edges") as f:
    fb_links = f.read().splitlines() 

len(fb_nodes), len(fb_links)


# capture nodes in 2 separate lists
node_list_1 = []
node_list_2 = []

for link in tqdm(fb_links):
    # each line looks like "source,target"; split once and keep both ends
    source, target = link.split(',')[:2]
    node_list_1.append(source)
    node_list_2.append(target)

fb_df = pd.DataFrame({'node_1': node_list_1, 'node_2': node_list_2})


# create graph
G = nx.from_pandas_edgelist(fb_df, "node_1", "node_2", create_using=nx.Graph())

# plot graph
plt.figure(figsize=(10,10))

pos = nx.random_layout(G, seed=23)
nx.draw(G, with_labels=False,  pos = pos, node_size = 40, alpha = 0.6, width = 0.7)
plt.show()

Here is where the problem occurs:

sorted(nx.common_neighbors(G, 0, 1))
NetworkXError                             Traceback (most recent call last)
<ipython-input-59-93f15d34d0d4> in <module>()
----> 1 sorted(nx.common_neighbors(G, 0, 1))

1 frames
/usr/local/lib/python3.7/dist-packages/networkx/classes/function.py in common_neighbors(G, u, v)
    952     """
    953     if u not in G:
--> 954         raise nx.NetworkXError("u is not in the graph.")
    955     if v not in G:
    956         raise nx.NetworkXError("v is not in the graph.")

NetworkXError: u is not in the graph.

Solution

  • It's not possible to reproduce your error without the same data, but a likely explanation is that the node labels in the graph are not integers, e.g. they are string representations of ints ('1', '2' rather than 1, 2). The edge file is read as text and each line is split on ',', so every label ends up as a str, and the integer 0 is then not a node of G.

    Possible solutions are:

    • check the dtypes of the pandas DataFrame and enforce an int dtype before building the graph: fb_df = fb_df.astype('int')
    • relabel the graph nodes with G_new = nx.convert_node_labels_to_integers(G). With this approach, check that the new labels match your expectations: they are assigned by node order, not by the value of the old label, so node '2' could be relabelled to integer 0. See the NetworkX documentation for convert_node_labels_to_integers for details.
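
The explanation and both fixes can be sketched on a toy graph. This is only an illustration under stated assumptions: raw_links is a made-up stand-in for the contents of fb-pages-food.edges, and the variable names (label_types, G_int, G_relabelled, old_label) are hypothetical. It first checks what type the node labels have, then queries the graph with string labels, with int-cast labels, and with relabelled nodes.

```python
import networkx as nx
import pandas as pd

# Made-up stand-in for the edge file: lines of "source,target" read as text.
raw_links = ["0,2", "1,2", "0,3", "1,3", "3,4"]
edges = [line.split(",") for line in raw_links]
fb_df = pd.DataFrame(edges, columns=["node_1", "node_2"])

G = nx.from_pandas_edgelist(fb_df, "node_1", "node_2", create_using=nx.Graph())

# Diagnose: which Python types are used as node labels?
label_types = {type(n).__name__ for n in G.nodes}
print(label_types)
print(0 in G, "0" in G)  # the int 0 is absent, the str "0" is present

# Workaround: query with the labels exactly as they are stored.
common_str = sorted(nx.common_neighbors(G, "0", "1"))

# Fix 1: cast the DataFrame to int before building the graph.
G_int = nx.from_pandas_edgelist(
    fb_df.astype(int), "node_1", "node_2", create_using=nx.Graph()
)
common_int = sorted(nx.common_neighbors(G_int, 0, 1))

# Fix 2: relabel to consecutive integers, keeping the old label as a
# node attribute so the mapping can be checked afterwards.
G_relabelled = nx.convert_node_labels_to_integers(G, label_attribute="old_label")
mapping = nx.get_node_attributes(G_relabelled, "old_label")  # new int -> old str
print(mapping)
```

With fix 2, look up the new integer label for each original label in mapping before calling nx.common_neighbors, since the new numbering follows node order rather than the original label values.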