I have to calculate a triple summation (Nested Binomial CDF), the equation is below,
It is a Binomial summation first from k = 0 to C then m = 0 to k and f = 0 to C-k, here s is a function which takes an input between 0 and 1 and gives an output between (0, 1).
I need help with finding an efficient method to execute this in python. For now I am using a triple loop which works fine but isn't efficient for large C's. The triple loop I am currently using is below,
Here 's' is essentially linear but it can take whatever shape needed. Also 'r' is taken as 0.1
import numpy as np
from math import comb
def s(d):
return d * (0.99 - 0.01) + 0.01
S = 0
C = 100
for k in range(C):
for m in range(k):
for f in range(C-k):
S += comb(C, k) * (((0.1)**(k))*((0.9)**(C-k))) * comb(k, m) * comb(C-k, f) * (2**(-C)) * s((f+m)/C)
Is there a more efficient way that I can use?
Edit: s is just a function of the following form, it takes in an input between 0 and 1 and gives output between 0.01 and 0.99 based on its shape. In the example code it is linear, but it can be exponential or whatever.
Edit2: The function 's' can't come out of the first summation, in the equation I provided it could due to a typo. It is now fixed along with inputting chrslg's suggestions.
So that s
is just an affine function. Which allows some optimization
Starting from
S=0
for k in range(C):
for m in range(k):
for f in range(C-k):
S += comb(C, k) * (((0.1)**(k))*((0.9)**(C-k))) * comb(k, m) * comb(C-k, f) * (2**(-C)) * s((f+m)/C)
Note that there is an error in this code: Σ in math notations use boundaries included (and in this kind of formula, it is quite usual to have from 0 to n included
, which means n+1 iterations. Because when you have n socks in a drawer, and take a random number of them, then you can take 0 sock, 1 sock, 2 socks, ... up to n socks, included :D).
So, the code should read
S=0
for k in range(C+1):
for m in range(k+1):
for f in range(C-k+1):
S += comb(C, k) * (((0.1)**(k))*((0.9)**(C-k))) * comb(k, m) * comb(C-k, f) * (2**(-C)) * s((f+m)/C)
We can get out of the loops everything that is not dependent on the index of the loop, as I mentioned in comments
S=0; r=0.1
for k in range(C+1):
Sm=0
for m in range(k+1):
Sf=0
for f in range(C-k+1):
Sf += comb(C-k, f) * s((f+m)/C)
Sm += Sf * comb(k,m) * 2**(-C)
S += Sm*comb(C,k) * r**k * (1-r)**(C-k)
Here, I just mimick the math equation rather than the code. And some operations are already factorized that way.
Sure, I can't remove directly the inner loop like this, because now you've replace s(m/C)
which is independent from f
, by s((m+f)/C)
which is not.
But since s
is affine, we can easily replace it by
S=0
for k in range(C+1):
Sm=0
for m in range(k+1):
Sf=0
for f in range(C-k+1):
Sf += comb(C-k, f) * (0.01 + 0.98*m/C + 0.98/C*f)
Sm += Sf * comb(k,m) * 2**(-C)
S += Sm*comb(C,k) * r**k * (1-r)**(C-k)
So, one constant term, one proportional to f term
And since Σcomb(n,k) is 2ᵏ and Σk.comb(n,k) is n×2ᵏ⁻¹, we can replace this by
S=0
for k in range(C+1):
Sm=0
for m in range(k+1):
Sf= (0.01+0.98*m/C)*2**(C-k) + 0.98/C * (C-k)*2**(C-k-1)
Sm += Sf * comb(k,m) * 2**(-C)
S += Sm*comb(C,k) * r**k * (1-r)**(C-k)
So here's goes one loop! Now we have 2 loops instead of 3. But it's not over yet!
Combining the Sf=
and Sm +=
lines
S=0
for k in range(C+1):
Sm=0
for m in range(k+1):
Sm += comb(k,m) * ((0.01+0.98*m/C)*2**(-k) + 0.98/C * (C-k)*2**(-k-1))
S += Sm*comb(C,k) * r**k * (1-r)**(C-k)
Note that, here again, what we are multiplying by comb(k,m)
is made of constant terms (0.01*2**(-k)
or 0.98/C*(C-k)*2**(-k-1)
are constant terms, as far as m
is concerned), and proportional to m
terms. So, using the same Σcomb(n,k) = 2ᵏ
and Σk.comb(n,k) = k.2ᵏ⁻¹
formulas, we can rewrite again
S=0
for k in range(C+1):
Sm=(0.01*2**(-k) + 0.98/C * (C-k)*2**(-k-1))*2**k + (0.98/C*2**(-k))*k*2**(k-1)
S += Sm*comb(C,k) * r**k * (1-r)**(C-k)
And here's go the second loop! Just one left. Not over yet!
Which can be simplified as
S=0
for k in range(C+1):
Sm=0.01 + 0.49/C*(C-k) + (0.49/C)*k
S += Sm*comb(C,k) * r**k * (1-r)**(C-k)
Or even
S=0
for k in range(C+1):
Sm=0.01 + 0.49 - 0.49/C*k + (0.49/C)*k
S += Sm*comb(C,k) * r**k * (1-r)**(C-k)
So, Sm=0.5 obviously
S=0
for k in range(C+1):
S += 0.5*comb(C,k) * r**k * (1-r)**(C-k)
But Σ comb(n,k) aᵏbⁿ⁻ᵏ
is the well known development of (a+b)ⁿ.
So that is 0.5×(r + (1-r))ᶜ = 0.5×1ᶜ = 0.5
Hence the ultimate simplification of your code (sorry, can't do better than that!)
S=0.5
And you can easily check that, once the boundaries problems corrected, your code also returns 0.5 whatever the parameters (C and r) are.
Since this is supposed to be a numpy question, not a math simplification question, here is how you can accelerate this kind of computation with numpy.
Suppose you are trying to compute the inner loop
for f in range(C-k+1):
Sf += comb(C-k, f) * s((f+m)/C)
(the more interesting one. The one because of which I wanted to know s
, and particularly if it was vectorizable. My initial plan was to give the kind of answer I am giving now, if s
was vectorizable, that is if s
can work with both scalars and arrays as parameters. But I had to change my plan when I saw that s
was even affine, which is even better. But, well, let's suppose that s
is more complicated, for example contains sin
, log
or even just some higher order polynomials that make my formal computation impossible)
If, we could compute, at once, a vector of C-k+1
values, using vectorized functions (again, functions that works on whole arrays, if asked to), then we could skip the for
loop, and sum that with numpy's sum
method. It is still a for
loop in reality. But a hidden one, and, more importantly, one that occurs inside numpy's optimized C code, node in our slow pure python.
We can start with a vector of values for f
f=np.arange(C-k+1.0)
. Note the 1.0
, lazy way to ensure it is float
Good news is your s
(and probably all s
you could think of, even with the sin
, log
or higher degree polynomials I've mentioned) is vecorized. So we can easily compute a vector of s((f+m)/C)
.
That is simply by literally doing that: if f
is a vector, then so is (f+m)/C
and so is s((f+m)/C)
since s
is vectorized.
So, the harder part is the comb part.
With numpy you can do it with a cumprod
f=np.arange(C-k+1.0)
np.seterr(divide='ignore') # Just because the 1st term I am about to compute is 1/0
combRatio=((C-k-f)/f) # That is also a vector, since f is
combRatio[0]=1 # neutral cumprod
mycomb=combRatio.cumprod() # All ΠcombRatio aka comb(f,C-k)
So, now we have a vector of comb(f,C-k)
and a vector of s((f+m)/C)
. To have a vector of all terms, we just need to multiply both
f=np.arange(C-k+1.0)
np.seterr(divide='ignore') # Just because the 1st term I am about to compute is 1/0
combRatio=((C-k-f)/f) # That is also a vector, since f is
combRatio[0]=1 # neutral cumprod
mycomb=combRatio.cumprod() # All ΠcombRatio aka comb(f,C-k)
terms = mycomb * s((f+m)/C)
# And the Σ result is simply
terms.sum()
Note that if you are willing to use scipy, it is even simpler, since scipy contains a vectorized comb
function
f=np.arange(C-k+1.0)
Sf=(scipy.specials.comb(C-k,f) * s((f+m)/C)).sum()
This times, no math tricks from me. I simplified nothing. This is just coding tricks that made the loop disappear. It is still there (inside scipy's comb
, numpy's +
, -
, *
and .sum()
). There are even 5 of them now (non-nested, and way more than 5 times faster than a pure python loop)