Tags: python, scipy, probability-distribution

Python Scipy add probability distributions


Given multiple probability distributions, is there an optimized function in scipy (or another library) that lets me add them, that is, compute the distribution of the sum of the corresponding independent random variables? Consider a simple example:

Suppose I have a 6-sided die and a 20-sided die, and I want the probability mass function of their total, which ranges from 2 to 26.

from scipy.stats import randint
import pandas as pd
import plotly.express as px
x_points_6 = [x+1 for x in range(6)]
x_points_20 = [x+1 for x in range(20)]

dist_6_sided = [randint.pmf(k,1,7) for k in x_points_6]
dist_20_sided = [randint.pmf(k,1,21) for k in x_points_20]
total_dist = add_dist(x_points_6, dist_6_sided, x_points_20, dist_20_sided)
df = pd.DataFrame({'x':total_dist[0], 'y':total_dist[1]})

Here add_dist is my own function, which more or less brute-forces the addition of the distributions:

def add_dist(value1, prob1, value2, prob2):
    # Enumerate every pair of outcomes; independence lets the probabilities multiply.
    result_value = []
    result_prob = []
    for prob1_i, value1_i in zip(prob1, value1):
        for prob2_i, value2_i in zip(prob2, value2):
            result_value.append(value1_i + value2_i)
            result_prob.append(prob1_i * prob2_i)
    # The same sum appears many times (e.g. 1+2 and 2+1); the groupby below merges them.
    return [result_value, result_prob]

And with this, I can get the resulting distribution:

df_added_pdfs = df.groupby(['x']).sum()
fig = px.bar(df_added_pdfs)
fig.show()


I am looking for a solution that can add any of scipy's built-in discrete or continuous distributions, not just the simple uniform case. I may be missing the correct search term: this seems routine enough that scipy or numpy should have a function for it, ideally one more optimized than my brute-force version.


Solution

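  • The distribution of a sum of independent random variables is the convolution of their distributions, so for the two dice you can convolve the PMFs: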

    from scipy.stats import randint
    from scipy.signal import convolve
    import pandas as pd
    import plotly.express as px
    x_points_6 = [x+1 for x in range(6)]
    x_points_20 = [x+1 for x in range(20)]
    
    dist_6_sided = [randint.pmf(k,1,7) for k in x_points_6]
    dist_20_sided = [randint.pmf(k,1,21) for k in x_points_20]
    
    conv = convolve(dist_6_sided, dist_20_sided)
    df = pd.DataFrame({'x':range(2, 26+1), 'y':conv})
    
    df_added_pdfs = df.groupby(['x']).sum()
    fig = px.bar(df_added_pdfs)
    fig.show()
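
The same idea should carry over to other integer-valued distributions, as long as you keep track of where each support starts. The following is a minimal sketch, not a scipy API: add_discrete and its inclusive lo/hi bounds are hypothetical names introduced here, and it assumes both variables are independent and take integer values. For distributions with unbounded support (Poisson, for example) the bounds would have to truncate the tail, so the result would only be approximate.

```python
import numpy as np
from scipy.stats import randint, binom

def add_discrete(dist_a, lo_a, hi_a, dist_b, lo_b, hi_b):
    """PMF of X + Y for independent, integer-valued X and Y.

    dist_a and dist_b are frozen scipy.stats discrete distributions.
    Each [lo, hi] pair (inclusive) is the integer range to evaluate on.
    """
    k_a = np.arange(lo_a, hi_a + 1)
    k_b = np.arange(lo_b, hi_b + 1)
    # Full convolution: one entry per possible value of the sum.
    pmf_sum = np.convolve(dist_a.pmf(k_a), dist_b.pmf(k_b))
    # The smallest sum is lo_a + lo_b; each later entry is one higher.
    support = np.arange(lo_a + lo_b, hi_a + hi_b + 1)
    return support, pmf_sum

# The two dice from the question: totals 2 through 26.
support, pmf = add_discrete(randint(1, 7), 1, 6, randint(1, 21), 1, 20)

# A non-uniform case: a 6-sided die plus a Binomial(10, 0.3) count.
support2, pmf2 = add_discrete(randint(1, 7), 1, 6, binom(10, 0.3), 0, 10)
```

Returning the support alongside the probabilities avoids hard-coding range(2, 26+1) for each new pair of distributions.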
    
    

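
For continuous distributions, one possible approach is to sample both PDFs on evenly spaced grids with the same spacing and convolve numerically. This is only an approximation, and the details below are assumptions made for illustration: the two normal distributions, the grid limits (about 8 standard deviations either side of each mean) and the spacing dx are all arbitrary choices. Two normals are used because their sum is known exactly, which gives something to compare against.

```python
import numpy as np
from scipy.stats import norm

dx = 0.01  # grid spacing; both grids must use the same value

# Integer multiples of dx keep the two grids exactly aligned.
x_a = np.arange(-800, 801) * dx     # -8 to 8, for N(0, 1)
x_b = np.arange(-1500, 1701) * dx   # -15 to 17, for N(1, 2)

pdf_a = norm(0, 1).pdf(x_a)
pdf_b = norm(1, 2).pdf(x_b)

# Multiplying by dx turns the discrete convolution into a Riemann-sum
# approximation of the convolution integral.
pdf_sum = np.convolve(pdf_a, pdf_b) * dx
# The output grid starts at the sum of the two grid starts.
x_sum = x_a[0] + x_b[0] + dx * np.arange(pdf_sum.size)

# Sanity check: N(0, 1) + N(1, 2) is N(1, sqrt(5)).
exact = norm(1, np.sqrt(5)).pdf(x_sum)
max_error = np.max(np.abs(pdf_sum - exact))
```

The grids need to cover essentially all of each distribution's mass, and accuracy is worse for PDFs with jumps, such as the exponential at zero. For long grids, scipy.signal.fftconvolve can stand in for np.convolve.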