This is a question about the function nchoosek
in Matlab.
I want to find nchoosek(54,25)
, which is the same as 54C25. Since the answer is about 10^15, I originally use int64
. However the answer is wrong with respect to the symbolic one.
Input:
nchoosek(int64(54),int64(25))
nchoosek(sym(54),sym(25))
Output:
1683191473897753
1683191473897752
You can see that they differ by one. This is not really an urgent problem since I now use sym
. However can someone tell me why this happens?
I am using R2013a.
I take a look at the nchoosek.m
, and find that if the input are in int64
, the code can be simplified into
function c = nchoosek2(v,k)
n = v; % rename v to be n. the algorithm is more readable this way.
classOut = 'int64';
nd = double(n);
kd = double(k);
nums = (nd-kd+1):nd;
dens = 1:kd;
nums = nums./dens; %%
c = round(prod(nums));
c = cast(c,classOut);
end
However, the outcome of int64(prod(nums./dens))
is different from prod(sym(nums)./sym(dens))
for me. Is this the same for everyone?
There is as @Amro shows a special case in nchoosek
for classOut of int64 or unit64,
however in 2013a this is only applied when the answer is between
flintmax
(with no argument) anddouble(intmax(classOut)) + 2*eps(double(intmax(classOut)))
which for int64
gives 9007199254740992 & 9223372036854775808, which the solution does not lie between...
If the solution had fallen between these values it would be recalculated using the subfunction binCoef
for which the help states: For integers, compute N!/((N-K)! M!) using prime factor cancellations
The binCoef
function would have produced the right answer for the given int64 inputs
binCoef
is not calledInstead the "default" pascals triangle method is used in which:
((n-k+1):n)./(1:k)
is takenk
double representations of fractions.So what we have is almost certainly floating point error.
Two options I can see;
&& c >= flintmax
from line 81Removing this expression will force Matlab to use the more accurate integer based calculation for inputs of int64 and uint64 for any values within their precision. This will be slightly slower but will avoid floating point errors, which are rightfully unexpected when working with integer types.
Option one - should be fairly straight forward...
Option two - I recommend keeping an unchanged backup of the original function, or makeing a copy of the function with the modification and use that instead.