Search code examples
c++performancealgorithmcombinations

Efficient implementation of Faulhaber's Formula


I want an efficient implementation of Faulhaber's Formula

I want answer as

F(N,K) % P

where F(N,K) is implementation of faulhaber's forumula and P is a prime number.

Note: N is very large upto 10^16 and K is upto 3000

I tried the double series implementation in the given site. But its too much time consuming for very large n and k. Can any one help making this implementation more efficient or describe some other way to implement the formula.


Solution

  • How about using Schultz' (1980) idea, outlined below the double series implementation (mathworld.wolfram.com/PowerSum.html) that you mentioned?

    From Wolfram MathWorld:

        Schultz (1980) showed that the sum S_p(n) can be found by writing

        enter image description here

        and solving the system of p+1 equations

        enter image description here

       obtained for j=0, 1, ..., p (Guo and Qi 1999), where delta (j,p) is the Kronecker delta.

    Below is an attempt in Haskell that seems to work. It returns a result for n=10^16, p=1000 in about 36 seconds on my old laptop PC.

    {-# OPTIONS_GHC -O2 #-}
    
    import Math.Combinatorics.Exact.Binomial
    import Data.Ratio
    import Data.List (foldl')
    
    kroneckerDelta a b | a == b    = 1 % 1
                       | otherwise = 0 % 1
    
    g a b = ((-1)^(a - b +1) * choose a b) % 1
    
    coefficients :: Integral a => a -> a -> [Ratio a] -> [Ratio a]
    coefficients p j cs
      | j < 0     = cs
      | otherwise = coefficients p (j - 1) (c:cs)
     where
       c = f / g (j + 1) j
       f = foldl h (kroneckerDelta j p) (zip [j + 2..p + 1] cs)
       h accum (i,cf) = accum - g i j * cf
    
    trim r = let n = numerator r
                 d = denominator r
                 l = div n d
             in (mod l (10^9 + 7),(n - d * l) % d)
    
    s n p = numerator (a % 1 + b) where
     (a,b) = foldl' (\(i',r') (i,r) -> (mod (i' + i) (10^9 + 7),r' + r)) (0,0) 
          (zipWith (\c i ->  trim (c * n^i)) (coefficients p p []) [1..p + 1])
    
    main = print (s (10^16) 1000)