I've been working through Project Euler and Sphere Online Judge problems. In this particular problem, I have to find all the prime numbers within two given numbers. I have a function that looks promising (based on the Sieve of Eratosthenes), except it's too slow. Can someone spot what is slowing my function down so much, and hint at how I can fix it? Also, some comments about how to approach optimization in general (or links to such comments/books/articles etc,) would be greatly appreciated.
Code:
def ranged_sieve(l, b)
primes = (l..b).to_a
primes[0]=nil if primes[0] < 2
(2..Math.sqrt(b).to_i).each do |counter|
step_from = l / counter
step_from = step_from * counter
l > 3 ? j = step_from : j = counter + counter
(j..b).step(counter) do |stepped|
index = primes.index(stepped)
primes[index] = nil if index
end
end
primes.compact
end
I haven't looked fully, but one factor is that, you are replacing a certain value in primes
with nil
, and later compact
-ing it to remove them. This is a waste. Just by doing that directly with delete_at
makes it more than twice fast:
def ranged_sieve2(l, b)
primes = (l..b).to_a
primes.delete_at(0) if primes[0] < 2
(2..Math.sqrt(b).to_i).each do |counter|
step_from = l / counter
step_from = step_from * counter
l > 3 ? j = step_from : j = counter + counter
(j..b).step(counter) do |stepped|
index = primes.index(stepped)
primes.delete_at(index) if index
end
end
primes
end
ranged_sieve(1, 100) # => Took approx 8e-4 seconds on my computer
ranged_sieve2(1, 100) # => Took approx 3e-4 seconds on my computer
Another point to improve is that, using a hash is much faster than array as the relevant size gets larger. Replacing your array implementation with a hash, you can get this:
def ranged_sieve3(l, b)
primes = (l..b).inject({}){|h, i| h[i] = true; h}
primes.delete(0)
primes.delete(1)
(2..Math.sqrt(b).to_i).each do |counter|
step_from = l / counter
step_from = step_from * counter
l > 3 ? j = step_from : j = counter + counter
(j..b).step(counter) do |stepped|
primes.delete(stepped)
end
end
primes.keys
end
When you do range_sieve3(1, 100)
with this, it is slower than range_sieve2(1, 100)
because of the overhead. But as you make the number larger, the superiority becomes salient. For example, I got this result on my computer:
ranged_sieve(1, 1000) # => Took 1e-01 secs
ranged_sieve2(1, 1000) # => Took 3e-02 secs
ranged_sieve3(1, 1000) # => Took 8e-04 secs