Optimize trivial summation in calculation of geometric distribution

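def geo_dist_names(p, k):
    # Normalizing constant: p + p**2 + p**3
    total = 0  # named `total` to avoid shadowing the built-in sum()
    for i in range(1, 4):
        total += p**i
    return (p**(1+k))/total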

p is a float between 0 and 1 and k is an int between 0 and 3. The function finds the value of a geometric distribution for the given p and k, then normalizes it by dividing by the sum of the potential values for k (as written, the loop adds p**1 through p**3).

It works, but I am calling this function many times, so I wonder whether there is a faster way to perform this operation.

Solution

A vectorized version of your code would be:

import numpy as np

def geo_dist_names(p, k):
    return (p**(1+k))/(p**np.arange(1,4)).sum()

However, I'm not sure it will be faster than pure Python: the range is very small here, so NumPy's per-call overhead is probably not negligible.
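Another option that avoids the loop altogether is the closed form of a finite geometric series. The following is only a sketch, not part of the original code: the name geo_dist_names_closed and the N parameter are introduced here for illustration, and it assumes 0 < p < 1 (at p == 0 the loop version divides by zero, while this one does not).

```python
def geo_dist_names_closed(p, k, N=4):
    """Same value as the loop version, via the finite geometric series.

    p + p**2 + ... + p**(N-1) == p * (1 - p**(N-1)) / (1 - p)   for p != 1,
    so p**(1 + k) / sum simplifies to p**k * (1 - p) / (1 - p**(N-1)).
    """
    if p == 1:
        # The closed form divides by zero at p == 1; every term is 1 there,
        # so the sum is simply N - 1.
        return 1.0 / (N - 1)
    return p**k * (1 - p) / (1 - p**(N - 1))
```

This replaces the summation with two power operations regardless of N. Since the denominator depends only on p, it could also be computed once and reused if the function is called many times with the same p.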

Edit: indeed, with the two versions parameterized by the range bound N:

def geo_dist_names_python(p, k, N=4):
    total = 0  # named `total` to avoid shadowing the built-in sum()
    for i in range(1, N):
        total += p**i
    return (p**(1+k))/total

def geo_dist_names_numpy(p, k, N=4):
    return (p**(1+k))/(p**np.arange(1,N)).sum()
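One way to compare the two on your own machine is a small timeit harness. This is a minimal sketch: the compare helper, its default arguments, and the chosen values of N are assumptions made for illustration, and the absolute timings will vary by machine and NumPy version.

```python
import timeit

import numpy as np


def geo_dist_names_python(p, k, N=4):
    total = 0
    for i in range(1, N):
        total += p**i
    return p**(1 + k) / total


def geo_dist_names_numpy(p, k, N=4):
    return p**(1 + k) / (p**np.arange(1, N)).sum()


def compare(p=0.9, k=2, sizes=(4, 100, 1000), repeats=200):
    """Return {N: (python_seconds, numpy_seconds)} for `repeats` calls each."""
    results = {}
    for N in sizes:
        t_py = timeit.timeit(lambda: geo_dist_names_python(p, k, N), number=repeats)
        t_np = timeit.timeit(lambda: geo_dist_names_numpy(p, k, N), number=repeats)
        results[N] = (t_py, t_np)
    return results


if __name__ == "__main__":
    for N, (t_py, t_np) in compare().items():
        print(f"N={N}: python {t_py:.6f}s, numpy {t_np:.6f}s")
```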

NumPy is faster only once the range becomes larger: