Search code examples
rubyprimesmathematical-optimization

How do I efficiently sieve through a selected range for prime numbers?


I've been working through Project Euler and Sphere Online Judge problems. In this particular problem, I have to find all the prime numbers within two given numbers. I have a function that looks promising (based on the Sieve of Eratosthenes), except it's too slow. Can someone spot what is slowing my function down so much, and hint at how I can fix it? Also, some comments about how to approach optimization in general (or links to such comments/books/articles etc,) would be greatly appreciated.

Code:

def ranged_sieve(l, b)
  primes = (l..b).to_a
  primes[0]=nil if primes[0] < 2
  (2..Math.sqrt(b).to_i).each do |counter|
    step_from = l / counter
    step_from = step_from * counter
    l > 3 ? j = step_from : j = counter + counter
    (j..b).step(counter) do |stepped|
      index = primes.index(stepped)
      primes[index] = nil if index
    end
  end
  primes.compact
end

Solution

  • I haven't looked fully, but one factor is that, you are replacing a certain value in primes with nil, and later compact-ing it to remove them. This is a waste. Just by doing that directly with delete_at makes it more than twice fast:

    def ranged_sieve2(l, b)
      primes = (l..b).to_a
      primes.delete_at(0) if primes[0] < 2
      (2..Math.sqrt(b).to_i).each do |counter|
        step_from = l / counter
        step_from = step_from * counter
        l > 3 ? j = step_from : j = counter + counter
        (j..b).step(counter) do |stepped|
          index = primes.index(stepped)
          primes.delete_at(index) if index
        end
      end
      primes
    end
    
    ranged_sieve(1, 100) # => Took approx 8e-4 seconds on my computer
    ranged_sieve2(1, 100) # => Took approx 3e-4 seconds on my computer
    

    Another point to improve is that, using a hash is much faster than array as the relevant size gets larger. Replacing your array implementation with a hash, you can get this:

    def ranged_sieve3(l, b)
      primes = (l..b).inject({}){|h, i| h[i] = true; h}
      primes.delete(0)
      primes.delete(1)
      (2..Math.sqrt(b).to_i).each do |counter|
        step_from = l / counter
        step_from = step_from * counter
        l > 3 ? j = step_from : j = counter + counter
        (j..b).step(counter) do |stepped|
          primes.delete(stepped)
        end
      end
      primes.keys
    end
    

    When you do range_sieve3(1, 100) with this, it is slower than range_sieve2(1, 100) because of the overhead. But as you make the number larger, the superiority becomes salient. For example, I got this result on my computer:

    ranged_sieve(1, 1000) # => Took 1e-01 secs
    ranged_sieve2(1, 1000) # => Took 3e-02 secs
    ranged_sieve3(1, 1000) # => Took 8e-04 secs