python statistics poisson

Generate the number of requests per hour for a day with Poisson arrivals in Python


Let's say we have a service that receives requests, and we count those requests on an hourly basis (from 12 to 1, from 1 to 2, and so on). I want to generate request counts that follow a Poisson arrival process and then store them in a dictionary representing a day of the week:

 monday = [hour_range, number_of_clients_in_that_hour]

At the end, we will have 7 dictionaries, named from Monday to Sunday, on which some linear regression can be used to predict the number of clients for the next hour of a given day.

Since I am simulating this scenario in Python, I need an arrival process that represents it realistically. The code below generates the number of clients in each hour from a uniform distribution. How can I do the same for Poisson arrivals, or for any other arrival process that better represents this scenario? My code is as follows:

import random

import matplotlib.pyplot as plt
import numpy as np

day_names = ['mon', 'tue', 'wed', 'thurs', 'fri', 'sat', 'sun']

time_values = np.linspace(1, 23, 23, dtype='int')  # hours 1, 2, ..., 23

for day_iterator in range(1, 7 + 1):
    number_of_clients = []  # number of clients for each hour of this day
    for i in range(1, 24, 1):  # one value per hour
        rand_value = random.randint(1, 20)  # uniform number of clients
        number_of_clients.append(rand_value)

    # a single day's data is complete after the inner loop;
    # globals() creates one module-level dict per day name
    globals()[day_names[day_iterator - 1]] = dict(zip(time_values, number_of_clients))

# print each day
print("monday = %s" % mon)
print("tuesday = %s" % tue)
print("wed = %s" % wed)
print("thurs = %s" % thurs)
print("fri = %s" % fri)
print("sat = %s" % sat)
print("sun = %s" % sun)

plt.plot(list(mon.keys()), list(mon.values()))

Solution

  • The path of least resistance is to use NumPy's built-in Poisson generator (numpy.random.poisson). However, if you want to roll your own, the following code will do the trick:

    import math
    import random
    
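    def poisson(rate):
        """Return one Poisson-distributed count with mean `rate`."""
        x = 0
        product = random.random()
        threshold = math.exp(-rate)  # computed once, outside the loop
        # multiply uniforms until the running product drops below exp(-rate)
        while product >= threshold:
            product *= random.random()
            x += 1
    
        return x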
    

    This is based on the fact that Poisson events have exponentially distributed interarrival times: you can generate unit-rate exponentials and count how many arrive before their sum exceeds the specified rate. The implementation above is slightly more clever, though. Exponentiating both sides of the sum/threshold comparison turns the sum of logarithms into a simple product of uniform random numbers, which is compared against a pre-calculated threshold, exp(-rate). This is algebraically identical to summing exponential random variates, but it performs a single exponentiation and an average of about rate multiplications, rather than an average of about rate logarithm evaluations.

    Finally, whichever generator you use, you need to know the rate. Bearing in mind that poisson is the French word for fish, one of the worst jokes in probability and statistics is the statement "the Poisson scales." It means that rates scale linearly with the length of the interval, so an hourly rate can be converted to a daily rate by multiplying by 24, the number of hours in a day. For example, if you have an average of 3 requests per hour, you will have an average of 72 per day.
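
As a rough sketch of the two ideas in this answer, the snippet below first counts unit-rate exponential interarrival times directly, then uses NumPy's Poisson generator to fill one dictionary per day. The helper names (poisson_by_exponential_sum, simulate_week), the hour keys 0 to 23, and the rate profile are assumptions made for illustration; they are not part of the original question or answer.

```python
import random

import numpy as np

DAY_NAMES = ["mon", "tue", "wed", "thurs", "fri", "sat", "sun"]


def poisson_by_exponential_sum(rate, rng=random):
    """Count unit-rate exponential interarrival times that fit in [0, rate]."""
    count = 0
    elapsed = rng.expovariate(1.0)  # time of the first arrival
    while elapsed <= rate:
        count += 1
        elapsed += rng.expovariate(1.0)  # time of the next arrival
    return count


def simulate_week(hourly_rates, seed=None):
    """Return {day: {hour: count}} with one Poisson draw per hour."""
    rng = np.random.default_rng(seed)
    week = {}
    for day in DAY_NAMES:
        counts = rng.poisson(lam=hourly_rates)  # one draw per hourly rate
        week[day] = {hour: int(n) for hour, n in enumerate(counts)}
    return week


# Hypothetical rate profile (an assumption, not from the question):
# 3 requests/hour off-peak, 12 requests/hour from 09:00 to 17:00.
hourly_rates = [12.0 if 9 <= hour < 17 else 3.0 for hour in range(24)]
week = simulate_week(hourly_rates, seed=42)
print("monday =", week["mon"])
```

Keeping the days in a single dictionary avoids creating variables by name, and a per-hour rate list lets busy and quiet hours differ. If real traffic data is available, the rates would be estimated from it rather than assumed.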