Search code examples
algorithmprimesprime-factoringsieve-of-eratosthenesfactorization

Fastest way to prime factorise a number up to 10^18


Given a number 1 <= n <= 10^18, how can I factorise it in least time complexity?

There are many posts on the internet addressing how you can find prime factors but none of them (at least from what I've seen) state their benefits, say in a particular situation.

I use Pollard's rho algorithm in addition to Eratosthenes' sieve:

  • Using sieve, find all prime numbers in the first 107 numbers, and then divide n with these primes as much as possible.
  • Now use Pollard's rho algorithm to try and find the rest of the primes until n is equal to 1.

My Implementation:

#include <iostream>
#include <vector>
#include <cstdio>
#include <ctime>
#include <cmath>
#include <cstdlib>
#include <algorithm>
#include <string>

using namespace std;

typedef unsigned long long ull;
typedef long double ld;
typedef pair <ull, int> pui;
#define x first
#define y second
#define mp make_pair

bool prime[10000005];
vector <ull> p;

void initprime(){
    prime[2] = 1;
    for(int i = 3 ; i < 10000005 ; i += 2){
        prime[i] = 1;
    }
    for(int i = 3 ; i * i < 10000005 ; i += 2){
        if(prime[i]){
            for(int j = i * i ; j < 10000005 ; j += 2 * i){
                prime[j] = 0;
            }
        }
    }
    for(int i = 0 ; i < 10000005 ; ++i){
        if(prime[i]){
            p.push_back((ull)i);
        }
    }
}

ull modularpow(ull base, ull exp, ull mod){
    ull ret = 1;
    while(exp){
        if(exp & 1){
            ret = (ret * base) % mod;
        }
        exp >>= 1;
        base = (base * base) % mod;
    }
    return ret;
}

ull gcd(ull x, ull y){
    while(y){
        ull temp = y;
        y = x % y;
        x = temp;
    }
    return x;
}

ull pollardrho(ull n){
    srand(time(NULL));
    if(n == 1)
        return n;
    ull x = (rand() % (n - 2)) + 2;
    ull y = x;
    ull c = (rand() % (n - 1)) + 1;
    ull d = 1;
    while(d == 1){
        x = (modularpow(x, 2, n) + c + n) % n;
        y = (modularpow(y, 2, n) + c + n) % n;
        y = (modularpow(y, 2, n) + c + n) % n;
        d = gcd(abs(x - y), n);
        if(d == n){
            return pollardrho(n);
        }
    }
    return d;
}
int main ()
{
ios_base::sync_with_stdio(false);
cin.tie(0);
initprime();
ull n;
cin >> n;
ull c = n;
vector <pui> o;
for(vector <ull>::iterator i = p.begin() ; i != p.end() ; ++i){
    ull t = *i;
    if(!(n % t)){
        o.push_back(mp(t, 0));
    }
    while(!(n % t)){
        n /= t;
        o[o.size() - 1].y++;
    }
}
while(n > 1){
    ull u = pollardrho(n);
    o.push_back(mp(u, 0));
    while(!(n % u)){
        n /= u;
        o[o.size() - 1].y++;
    }
    if(n < 10000005){
        if(prime[n]){
            o.push_back(mp(n, 1));
        }
    }
}
return 0;
}

Is there any faster way to factor such numbers? If possible, please explain why along with the source code.


Solution

  • Approach

    Lets say you have a number n that goes up to 1018 and you want to prime factorise it. Since this number can be as small as unity and as big as 1018, all along it can be prime as well as composite, this would be my approach -

    1. Using miller rabin primality testing, make sure that the number is composite.
    2. Factorise n using primes up to 106, which can be calculated using sieve of Eratosthenes.

    Now the updated value of n is such that it has prime factors only above 106 and since the value of n can still be as big as 1018, we conclude that the number is either prime or it has exactly two prime factors (not necessarily distinct).

    1. Run Miller Rabin again to ensure the number isn't prime.
    2. Use Pollard rho algorithm to get one prime factor.

    You have the complete factorisation now.

    Lets look at the time-complexity of the above approach:

    • Miller Rabin takes O(log n)
    • Sieve of Eratosthenes takes O(n*log n)
    • The implementation of Pollard rho I shared takes O(n^0.25)

    Time Complexity

    Step 2 takes maximum time which is equal to O(10^7), which is in turn the complexity of the above algorithm. This means you can find the factorisation within a second for almost all programming languages.

    Space Complexity

    Space is used only in the step 2 where sieve is implemented and is equal to O(10^6). Again, very practical for the purpose.

    Implementation

    Complete Code implemented in C++14. The code has a hidden bug. You can either reveal it in the next section, or skip towards the challenge ;)

    Bug in the code

    In line 105, iterate till i<=np. Otherwise, you may miss the cases where prime[np]=999983 is a prime factor

    Challenge

    Give me a value of n, if any, where the shared code results in wrong prime factorisation.

    Bonus

    How many such values of n exist ?

    Hint

    For such value of n, assertion in Line 119 may fail.

    Solution

    Lets call P=999983. All numbers of the form n = p*q*r where p, q, r are primes >= P such that at least one of them is equal to P will result in wrong prime factorisation.

    Bonus Solution

    There are exactly four such numbers: {P03, P02P1, P02P2, P0P12}, where P0 = P = 999983, P1 = next_prime(P0) = 1000003, P2 = next_prime(P1) = 1000033.