data-science networkx graph-theory bipartite

Get node weight for a simple bipartite graph

I've created a bipartite networkx graph from a CSV file that maps Disorders to Symptoms. So, a disorder may be linked to one or more Symptoms.

for disorder, symptoms in csv_dictionary.items():
    for symptom in symptoms:
        G.add_edge(disorder, symptom)

I need to find which Symptoms are connected to multiple Disorders and sort them by their weight. Any suggestions?

Solution

You can use the degree of each node in the graph. Every symptom with a degree greater than 1 is linked to at least two disorders.

I've added an example csv_dictionary (please include one in your next question as a minimal reproducible example) and collected the set of all symptoms while building the graph. You could also store this information as a node attribute in the graph.

import networkx as nx

csv_dictionary = {"a": ["A"], "b": ["B"], "c": ["A", "C"], "d": ["D"], "e": ["E", "B"], "f":["F"], "g":["F"], "h":["F"]}

G = nx.Graph()

all_symptoms = set()
for disorder, symptoms in csv_dictionary.items():
    for symptom in symptoms:
        G.add_edge(disorder, symptom)
        # remember which nodes are symptoms
        all_symptoms.add(symptom)

symptoms_with_multiple_diseases = [symptom for symptom in all_symptoms if G.degree(symptom) > 1]
print(symptoms_with_multiple_diseases)
# e.g. ['B', 'F', 'A'] (set iteration order may vary between runs)

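# sorted() already returns a list; ascending degree, pass reverse=True for descending
sorted_symptoms = sorted(symptoms_with_multiple_diseases, key=lambda symptom: G.degree(symptom))
print(sorted_symptoms)
# e.g. ['B', 'A', 'F'] (A and B both have degree 2, so their relative order may vary)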
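
The node-attribute idea mentioned above could look roughly like the sketch below. It marks each node with a bipartite attribute (0 for disorders, 1 for symptoms, the convention used by networkx's bipartite module) instead of keeping a separate set, and stores each symptom's degree under an attribute name, n_disorders, that is made up for this example. It also assumes that "weight" means the number of linked disorders, and it sorts in descending order with the symptom name as a tie-breaker so the result is reproducible.

```python
import networkx as nx

csv_dictionary = {"a": ["A"], "b": ["B"], "c": ["A", "C"], "d": ["D"],
                  "e": ["E", "B"], "f": ["F"], "g": ["F"], "h": ["F"]}

G = nx.Graph()
for disorder, symptoms in csv_dictionary.items():
    # bipartite=0 marks disorders, bipartite=1 marks symptoms
    G.add_node(disorder, bipartite=0)
    for symptom in symptoms:
        G.add_node(symptom, bipartite=1)
        G.add_edge(disorder, symptom)

# Recover the symptom side from the node attribute
symptom_nodes = [n for n, data in G.nodes(data=True) if data["bipartite"] == 1]

# "n_disorders" is a hypothetical attribute name chosen for this sketch
nx.set_node_attributes(G, {s: G.degree(s) for s in symptom_nodes}, name="n_disorders")

# Keep symptoms linked to more than one disorder, highest count first;
# the name is a tie-breaker so the order is deterministic
shared = [s for s in symptom_nodes if G.nodes[s]["n_disorders"] > 1]
ranked = sorted(shared, key=lambda s: (-G.nodes[s]["n_disorders"], s))
print(ranked)
# ['F', 'A', 'B']
```

This only works if disorder and symptom names never collide, because both kinds of node live in the same graph.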