Networkx - Get probability p(k) from network


I have plotted a histogram of my network (a dataframe), showing how many nodes have each number of connections 'k', like so:

import seaborn as sns

parameter = 'k'
sns.histplot(network[parameter])

[Figure: histogram of the 'k' values in the network dataframe]

Now I need to create a modular random graph that follows the distribution above, using:

from networkx.generators.community import random_partition_graph

random_partition_graph(sizes, p_in, p_out, seed=None, directed=False)

Instead of counts, I need the probability p(k), which must be passed as p_in:

p_in (float)
probability of edges within groups

How do I get p(k) from my network?


Solution

  • This is how I would handle what you described. First, normalize your histogram so that the bar heights sum to 1 (with unit-width bins, this is the same as the histogram integrating to 1). You can do this by setting the weights argument of the histogram to 1/N for every node, where N is the number of nodes. The normalized histogram can then be treated as the probability distribution p(k) of your degrees. Once you have this distribution, i.e. an array of probabilities (deg_prob in the code), you can sample from it with np.random.choice(np.arange(np.amin(degrees),np.amax(degrees)+1), p=deg_prob, size=N_sampling). From these samples, you can create a random expected_degree_graph by passing them as the w argument. Finally, you can compare the degree distribution of the original graph with that of the random graph.

    See below for the code and more details:

    import networkx as nx
    from networkx.generators.random_graphs import binomial_graph
    from networkx.generators.degree_seq import expected_degree_graph
    import matplotlib.pyplot as plt
    import numpy as np

    fig = plt.figure()

    # Create a random graph to stand in for the data
    N_nodes = 1000
    G = binomial_graph(n=N_nodes, p=0.01, seed=0)

    # Degree of every node
    degrees = np.array([G.degree(n) for n in G.nodes()])

    # One unit-width bin per integer degree; the extra edge closes the last bin
    bins_val = np.arange(np.amin(degrees), np.amax(degrees) + 2)

    # Normalized histogram: weights of 1/N_nodes make the bar heights sum to 1,
    # so deg_prob[i] is the probability of degree bins_val[i]
    deg_prob, _, _ = plt.hist(degrees, bins=bins_val, align='left',
                              weights=np.ones_like(degrees) / N_nodes,
                              color='tab:orange', alpha=0.3,
                              label='Original distribution')

    # Sample N_sampling degrees from the distribution
    N_sampling = 500
    random_sampling = np.random.choice(
        np.arange(np.amin(degrees), np.amax(degrees) + 1),
        p=deg_prob, size=N_sampling)

    # Create a random graph whose expected degrees are the sampled values
    G_random_sampling = expected_degree_graph(random_sampling, seed=0, selfloops=False)
    degrees_random_sampling = np.array(
        [G_random_sampling.degree(n) for n in G_random_sampling.nodes()])

    # Normalized histogram of the random graph's degrees, on the same bins
    deg_prob_random_sampling, _, _ = plt.hist(
        degrees_random_sampling, bins=bins_val, align='left',
        weights=np.ones_like(degrees_random_sampling) / N_sampling,
        color='tab:blue', alpha=0.3, label='Sample distribution')

    # Plot both histograms together
    plt.xticks(bins_val)
    plt.xlabel('degree')
    plt.ylabel('Prob')
    plt.legend()
    plt.show()
    

    The output then gives:

    [Figure: overlaid normalized degree histograms, 'Original distribution' and 'Sample distribution']
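If you only need the numbers p(k) and not the plot, the normalization step can also be done without matplotlib. The following is a minimal sketch using only the standard library. The helper names (degree_pmf, sample_degrees, rough_p_in) and the small degrees list are made up for illustration; with the dataframe from the question you would presumably start from something like network['k'].tolist(). Note that p_in in random_partition_graph is a single float, not a distribution. The last helper therefore rests on an assumption that is not in the question: if edges between groups are negligible (p_out close to 0), a node in a group of size s has expected degree of about p_in*(s-1), so the mean of p(k) gives a rough estimate of p_in.

```python
from collections import Counter
import random


def degree_pmf(degrees):
    """Return {k: p(k)}, where p(k) is the fraction of nodes with degree k."""
    counts = Counter(degrees)
    n = len(degrees)
    return {k: c / n for k, c in sorted(counts.items())}


def sample_degrees(pmf, size, seed=None):
    """Draw `size` degrees at random, weighted by the probabilities in `pmf`."""
    rng = random.Random(seed)
    ks = list(pmf)
    return rng.choices(ks, weights=[pmf[k] for k in ks], k=size)


def mean_degree(pmf):
    """Expected degree under the distribution p(k)."""
    return sum(k * p for k, p in pmf.items())


def rough_p_in(pmf, group_size):
    """Rough within-group edge probability.

    ASSUMPTION (not from the question): p_out is negligible, so the expected
    degree of a node in a group of `group_size` nodes is p_in * (group_size - 1).
    """
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    return min(1.0, mean_degree(pmf) / (group_size - 1))


# Hypothetical degree data; replace with the 'k' column of your own network
degrees = [1, 2, 2, 3, 3, 3, 4, 4, 2, 1]

pmf = degree_pmf(degrees)                     # probabilities p(k), summing to 1
samples = sample_degrees(pmf, size=5, seed=0)  # degrees drawn from p(k)
p_in_estimate = rough_p_in(pmf, group_size=11)
```

The resulting float could then be passed as p_in to random_partition_graph, while the sampled degrees play the same role as random_sampling in the code above. If p_out is not small, the estimate would need to account for the expected between-group edges as well.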