python dataframe poisson

How to calculate expected values for a column using Poisson distribution & then compare with actual values?


I have a dataframe that contains the results of different games. I need to calculate the expected results (how many games end with the same score) using a Poisson distribution, then compare the actual results with the expected ones. For example, suppose 2 games ended with result = 2, 4 games ended with result = 9, and so on. For each result value, I need the expected number of games to set against the actual number of games.

I calculated the mean of the result column, which I have read is also called the expected value, and plotted a histogram of the actual results.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Game Results DataFrame
game_results = pd.DataFrame({"game_id":[56,57,58,59,60],"result":[0,9,4,6,8]})
print(game_results)

# Histogram for result column

result = game_results["result"]

plt.hist(result)
plt.xlabel("Result")
plt.ylabel("Number of Games")
plt.title("Result Histogram")

lamb = result.mean()

Solution

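  • You can draw random samples from a Poisson distribution with np.random.poisson, passing your mean as the rate and the number of observations, i.e. len(game_results), as the sample size. The values are random draws, so they change on every run unless you set a seed: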

    import numpy as np
    import pandas as pd
    
    game_results = pd.DataFrame({"game_id":[56,57,58,59,60],"result":[0,9,4,6,8]})
    # Get the lambda (the mean of the observed results)
    lamb = game_results["result"].mean()
    # Draw one random Poisson sample per game using the lambda
    game_results["expected"] = np.random.poisson(lamb, len(game_results))
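
If what you want is the expected number of games for each result value rather than random draws, one possible approach is to multiply the Poisson probability of each result by the total number of games. The following is only a sketch under that reading of the question: the helper `poisson_pmf` and the column names `actual_games`, `expected_games` and `difference` are made up for illustration, and the PMF is written out by hand so that nothing beyond pandas is required.

```python
import math

import pandas as pd

game_results = pd.DataFrame(
    {"game_id": [56, 57, 58, 59, 60], "result": [0, 9, 4, 6, 8]}
)

lam = game_results["result"].mean()  # sample mean, used as the Poisson rate
n_games = len(game_results)


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam (illustrative helper)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


# Actual number of games for every result from 0 up to the observed maximum
scores = list(range(int(game_results["result"].max()) + 1))
actual = game_results["result"].value_counts().reindex(scores, fill_value=0)

comparison = pd.DataFrame({"actual_games": actual})
comparison.index.name = "result"

# Expected number of games per result = total games * P(result)
comparison["expected_games"] = [
    n_games * poisson_pmf(int(k), lam) for k in comparison.index
]
comparison["difference"] = comparison["actual_games"] - comparison["expected_games"]

print(comparison.round(3))
```

The expected counts are fractional, and they sum to slightly less than the number of games because results above the observed maximum are left out of the table.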