Search code examples
matlabcombinationsint64symbolic-computationprecision

Matlab nchoosek got difference answer using int64 and sym


This is a question about the function nchoosek in Matlab.

I want to find nchoosek(54,25), which is the same as 54C25. Since the answer is about 10^15, I originally use int64. However the answer is wrong with respect to the symbolic one.

Input:

nchoosek(int64(54),int64(25))
nchoosek(sym(54),sym(25))

Output:

1683191473897753
1683191473897752

You can see that they differ by one. This is not really an urgent problem since I now use sym. However can someone tell me why this happens?


EDIT:

I am using R2013a.

I take a look at the nchoosek.m, and find that if the input are in int64, the code can be simplified into

function c = nchoosek2(v,k)

    n = v;  % rename v to be n. the algorithm is more readable this way.

    classOut = 'int64';
    nd = double(n);
    kd = double(k);
    nums = (nd-kd+1):nd;
    dens = 1:kd;
    nums = nums./dens;      %%
    c = round(prod(nums));
    c = cast(c,classOut);
end

However, the outcome of int64(prod(nums./dens)) is different from prod(sym(nums)./sym(dens)) for me. Is this the same for everyone?


Solution

  • In 2013a this can be reproduced...

    There is as @Amro shows a special case in nchoosek for classOut of int64 or unit64,
    however in 2013a this is only applied when the answer is between

    • flintmax (with no argument) and
    • double(intmax(classOut)) + 2*eps(double(intmax(classOut)))

    which for int64 gives 9007199254740992 & 9223372036854775808, which the solution does not lie between...


    If the solution had fallen between these values it would be recalculated using the subfunction binCoef for which the help states: For integers, compute N!/((N-K)! M!) using prime factor cancellations

    The binCoef function would have produced the right answer for the given int64 inputs

    In 2013a with these inputs binCoef is not called

    Instead the "default" pascals triangle method is used in which:

    • Inputs are cast to double
    • The product of the vector ((n-k+1):n)./(1:k) is taken
    • this vector contains k double representations of fractions.

    So what we have is almost certainly floating point error.


    What can be done?

    Two options I can see;

    1. Make your own function based on the code in binCoef,
    2. Modify nchoosek and remove && c >= flintmax from line 81

    Removing this expression will force Matlab to use the more accurate integer based calculation for inputs of int64 and uint64 for any values within their precision. This will be slightly slower but will avoid floating point errors, which are rightfully unexpected when working with integer types.

    Option one - should be fairly straight forward...

    Option two - I recommend keeping an unchanged backup of the original function, or makeing a copy of the function with the modification and use that instead.