I was curious if there was a good way to do this. My current code is something like:
def factorialMod(n, modulus):
ans=1
for i in range(1,n+1):
ans = ans * i % modulus
return ans % modulus
But it seems quite slow!
I also can't calculate n! and then apply the prime modulus because sometimes n is so large that n! is just not feasible to calculate explicitly.
I also came across http://en.wikipedia.org/wiki/Stirling%27s_approximation and wonder if this can be used at all here in some way?
Or, how might I create a recursive, memoized function in C++?
Expanding my comment to an answer:
Yes, there are more efficient ways to do this. But they are extremely messy.
So unless you really need that extra performance, I don't suggest to try to implement these.
The key is to note that the modulus (which is essentially a division) is going to be the bottleneck operation. Fortunately, there are some very fast algorithms that allow you to perform modulus over the same number many times.
These methods are fast because they essentially eliminate the modulus.
Those methods alone should give you a moderate speedup. To be truly efficient, you may need to unroll the loop to allow for better IPC:
Something like this:
ans0 = 1
ans1 = 1
for i in range(1,(n+1) / 2):
ans0 = ans0 * (2*i + 0) % modulus
ans1 = ans1 * (2*i + 1) % modulus
return ans0 * ans1 % modulus
but taking into account for an odd # of iterations and combining it with one of the methods I linked to above.
Some may argue that loop-unrolling should be left to the compiler. I will counter-argue that compilers are currently not smart enough to unroll this particular loop. Have a closer look and you will see why.
Note that although my answer is language-agnostic, it is meant primarily for C or C++.