Search code examples
pythonpython-3.xmathprimessieve-algorithm

Sieve of Erosthenes much slower when called as function in Python


I have two blocks of code, both of which I have written to apply the sieve of eratosthenes to sum all primes up to 2000000. This first block, which is just raw code not wrapped in any function, is this:

N = 2000000
is_prime = (N + 1) * [True]

for candidate in range(2, N + 1):
    if is_prime[candidate]:
        print(candidate)
        for witness in range(2 * candidate, N + 1, candidate):
            is_prime[witness] = False

The second block of code has split this functionality into a function which check for primality, and then a for loop which specifies the upper bound. It is as follows:

  def is_prime(n):
  is_prime = (n + 1) * [True]

  for candidate in range(2, int(sqrt(n)) + 1):
      if is_prime[candidate]:
          for witness in range(2 * candidate, n+1, candidate):
              is_prime[witness] = False

  return is_prime[n]

for candidate in range(2, LIMIT):
    if is_prime(candidate):
        print(candidate)

However, the block of code split into the function which checks primality is infinitely slower. I cannot for the life of me figure out what the difference between these blocks of code is. What am I doing wrong?


Solution

  • Your second implementation keeps the list is_prime as a local. At every function invocation it "restarts" the computation by initialising the list to (n + 1) * [True].

    So by restarting the work you basically do N times the work when using your second implementation.

    Edit: as @Aaron correctly pointed out in the comments also your call to print() makes the second version slower.

    Problems

    Summarizing there are the following problems:

    • The implementation using a function restarts its work
    • The second implementation does a print. The first one does not which is obviously faster.
    • Just as a side note: your is_prime list has the same name as your function. This will cause trouble for example when using recursion.

    Improvements

    As a very simple improvement you could try to (rename and) move the is_prime list into a global variable. Then when is_prime(n) is called with a number that is not yet in the list, you extend the list (e.g. some_list += difference * [True]) and only compute the difference.