Search code examples
c++performancealgorithmoptimizationsieve-of-eratosthenes

Is super long redundant code efficient?


I'm trying to find some primes with the Sieve of the greek guy algorithm. I have some efficiency concerns. Here's the code:

void check_if_prime(unsigned number)
{
    unsigned index = 0;
    while (primes[index] <= std::sqrt(number))
    {
        if (number % primes[index] == 0) return;
        ++index;
    }
    primes.push_back(number);
}

And, because I coded huge 2/3/5/7/11/13 prime wheel, the code is 5795 lines longs.

for (unsigned i = 0; i < selection; ++i)
{
    unsigned multiple = i * 30030;
    if (i!=0) check_if_prime( multiple+1 );
    check_if_prime ( multiple+17 );
    check_if_prime ( multiple+19 );
    check_if_prime ( multiple+23 );
    // ...so on until 30029
}

Optimization flags: -O3, -fexpensive-optimizations, -march=pentium2

25 million primes in 20 minutes with CPU stuck at 50% (no idea why, tried real time priority but it didn't change much). Size of output text file is 256MB (going to change to binary later on).

  • Compilation takes ages! Is it okay? How can I make it faster without compromising efficiency?
  • Is that if statement at the start of for loop OK? I've read if statements take the longest.
  • Anything else concerning the code, not the algorithm? Anything to make it faster? What statements are faster than others?
  • Would even a bigger wheel (up to 510510, not just 30030, hell a lot of lines) compile within a day?

I want to find all primes up to 2^32 and little optimizations would save some hours and electricity. Thank you in advance!

EDIT: not seeking for an algorithm, seeking for code improvement if there can be done any!


Solution

  • Here is what I can say about the performance of your program:

    1. Likely your main problem is the call to std::sqrt(). This is a floating point function that's designed for full precision of the result, and it definitely take quite a few cycles. I bet you'll be much faster if you use this check instead:

      while (primes[index]*primes[index] < number)
      

      That way you are using an integer multiplication which is trivial for modern CPUs.

    2. The if statement at the start of your for() loop is irrelevant to performance. It's not executed nearly enough times. Your inner loop is the while loop within check_if_prime(). That's the one you need to optimize.

    3. I can't see how you are doing output. There are ways to do output that can severely slow you down, but I don't think that's the main issue (if it is an issue at all).

    4. Code size can be an issue: your CPU has an instruction cache with limited capacity. If your 6k lines don't fit into the first level instruction cache, the penalty can be severe. If I were you, I'd reimplement the wheel using data instead of code, i. e.:

      unsigned const wheel[] = {1, 17, 19, 23, ...};    //add all your 6k primes here
      for (unsigned i = 0; i < selection; ++i)
      {
          unsigned multiple = i * 30030;
          for(unsigned j = 0; j < sizeof(wheel)/sizeof(*wheel); j++) {
              check_if_prime(multiple + wheel[j]);
          }
      }