I'm trying to find some primes with the Sieve of the greek guy algorithm. I have some efficiency concerns. Here's the code:
void check_if_prime(unsigned number)
{
unsigned index = 0;
while (primes[index] <= std::sqrt(number))
{
if (number % primes[index] == 0) return;
++index;
}
primes.push_back(number);
}
And, because I coded huge 2/3/5/7/11/13 prime wheel, the code is 5795 lines longs.
for (unsigned i = 0; i < selection; ++i)
{
unsigned multiple = i * 30030;
if (i!=0) check_if_prime( multiple+1 );
check_if_prime ( multiple+17 );
check_if_prime ( multiple+19 );
check_if_prime ( multiple+23 );
// ...so on until 30029
}
Optimization flags: -O3, -fexpensive-optimizations, -march=pentium2
25 million primes in 20 minutes with CPU stuck at 50% (no idea why, tried real time priority but it didn't change much). Size of output text file is 256MB (going to change to binary later on).
I want to find all primes up to 2^32 and little optimizations would save some hours and electricity. Thank you in advance!
EDIT: not seeking for an algorithm, seeking for code improvement if there can be done any!
Here is what I can say about the performance of your program:
Likely your main problem is the call to std::sqrt()
. This is a floating point function that's designed for full precision of the result, and it definitely take quite a few cycles. I bet you'll be much faster if you use this check instead:
while (primes[index]*primes[index] < number)
That way you are using an integer multiplication which is trivial for modern CPUs.
The if statement at the start of your for()
loop is irrelevant to performance. It's not executed nearly enough times. Your inner loop is the while loop within check_if_prime()
. That's the one you need to optimize.
I can't see how you are doing output. There are ways to do output that can severely slow you down, but I don't think that's the main issue (if it is an issue at all).
Code size can be an issue: your CPU has an instruction cache with limited capacity. If your 6k lines don't fit into the first level instruction cache, the penalty can be severe. If I were you, I'd reimplement the wheel using data instead of code, i. e.:
unsigned const wheel[] = {1, 17, 19, 23, ...}; //add all your 6k primes here
for (unsigned i = 0; i < selection; ++i)
{
unsigned multiple = i * 30030;
for(unsigned j = 0; j < sizeof(wheel)/sizeof(*wheel); j++) {
check_if_prime(multiple + wheel[j]);
}
}