I'm looking to compute something like:

    sum(comb(5000, i) * 0.5**5000 * f(i) for i in range(1, 5001))

where f(i) is a function that returns a real number in [-1, 1] for any i in {1, 2, ..., 5000}.
Obviously, the result of the sum is somewhere in [-1, 1], but I can't seem to compute it in Python with straightforward code: 0.5**5000 underflows to 0 and comb(5000, 2000) overflows to inf, so the computed sum turns into NaN.
The obvious fix is to work in log space. Using the identity a * b = 2**(log2(a) + log2(b)), if I could compute log2(a) and log2(b) I could compute the sum, even when a is huge and b is almost 0.
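As a sanity check on that identity, the troublesome product comb(5000, 2000) * 0.5**5000 can be evaluated entirely in log space using only the standard library's math.lgamma (lgamma(n + 1) equals log(n!)); the helper name log2_term here is my own, for illustration:

```python
import math

def log2_term(n, k):
    # log2( comb(n, k) * 0.5**n ), computed without ever forming
    # the overflowing comb(n, k) or the underflowing 0.5**n:
    # log(comb(n, k)) = lgamma(n+1) - lgamma(k+1) - lgamma(n-k+1)
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb / math.log(2) - n

term = 2 ** log2_term(5000, 2000)
print(term)  # a tiny but nonzero probability, instead of inf * 0 = nan
```

The final exponent is small enough that 2**(...) is representable as an ordinary float, even though neither factor is on its own.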
So I guess what I'm asking is whether there's an easy way of computing

    log2(scipy.misc.comb(5000, 2000))

so I could compute my sum simply with

    sum(2**(log2comb(5000, i) - 5000) * f(i) for i in range(1, 5001))
@abarnert's solution, while working for the 5000 figure, addresses the problem by increasing the precision with which comb is computed. That works for this example, but it doesn't scale: the memory required would grow significantly if instead of 5000 we had, say, 1e7.
Currently, I'm using a workaround which is ugly, but keeps memory consumption low:

    log2(comb(5000, 2000)) == sum(log2(x) for x in range(1, 5001)) - sum(log2(x) for x in range(1, 2001)) - sum(log2(x) for x in range(1, 3001))
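Those three log sums can be collapsed into direct log-gamma evaluations with scipy.special.gammaln, since log(n!) = gammaln(n + 1); a sketch of the log2comb helper the question asks for:

```python
import numpy as np
from scipy.special import gammaln

def log2comb(n, k):
    # log2(comb(n, k)) via log-gamma: log(n!) = gammaln(n + 1),
    # so log(comb(n, k)) = gammaln(n+1) - gammaln(k+1) - gammaln(n-k+1)
    return (gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)) / np.log(2)

print(log2comb(5000, 2000))  # finite, even though comb(5000, 2000) overflows
```

This runs in constant time and memory regardless of n, so it scales to the 1e7 case mentioned above.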
Is there a way of doing so in a readable expression?
The sum is the expectation of f with respect to a binomial distribution with n = 5000 and p = 0.5. You can compute this with scipy.stats.binom.expect:
import scipy.stats as stats

def f(i):
    return i

n, p = 5000, 0.5
print(stats.binom.expect(f, (n, p), lb=0, ub=n))
# 2499.99999997
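As a cross-check (my addition, not required for the answer), the same expectation can be recovered by an explicit sum over binom.pmf, which scipy evaluates stably without ever forming the comb * 0.5**n product:

```python
import scipy.stats as stats

def f(i):
    return i

n, p = 5000, 0.5
# pmf(i) = comb(n, i) * p**i * (1-p)**(n-i), computed in log space internally
manual = sum(stats.binom.pmf(i, n, p) * f(i) for i in range(n + 1))
print(manual)  # ~2500, matching binom.expect
```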
Also note that as n goes to infinity with p fixed, the binomial distribution approaches the normal distribution with mean n*p and variance n*p*(1-p). Therefore, for large n you can instead compute:
import math
print(stats.norm.expect(f, loc=n*p, scale=math.sqrt((n*p*(1-p))), lb=0, ub=n))
# 2500.0