Search code examples
pythontime-complexityprimes

Why the optimized prime number generator program take more time?


I recently work on the prime number generator program and try to optimize the time complexity.

  1. My first attempt is simple: iterate from current_big_prime + 1 as current_number and in every iterate check from 1 to current_number-1 to see if current_number is prime.
  2. Then I realize it is sufficient that every check iterate(the inner loop) go until sqrt(current_number)+1 instead of current_number-1, which should save a lot of time.

But the result is opposite: the optimized program is worse on time complexity.

Here is my program:

#%% Prime Streaming (PG-13)
import timeit
import matplotlib.pyplot as plt
import math

class Primes:
    @staticmethod
    def stream1():
        primes = []
        yield 2
        i = 3
        while True:
            is_prime = True
            for j in primes:
                if i % j == 0: 
                    is_prime = False
                    break
            if is_prime:
                yield i
                primes.append(i)
            i += 2
    @staticmethod
    def stream2():
        primes = []
        yield 2
        i = 3
        while True:
            is_prime = True
            for j in filter(lambda x: x < math.sqrt(i) + 1, primes):
                if i % j == 0: 
                    is_prime = False
                    break
            if is_prime:
                yield i
                primes.append(i)
            i += 2

def get_nth_prime1(n):
    s = Primes.stream1()
    cnt = 0
    for i in s:
        cnt += 1
        if cnt > n:
            break
    return i
def get_nth_prime2(n):
    s = Primes.stream2()
    cnt = 0
    for i in s:
        cnt += 1
        if cnt > n:
            break
    return i

s1 = Primes().stream1()
for i in range(20):
    print(next(s1), end=' ')
print()

s2 = Primes().stream2()
for i in range(20):
    print(next(s2), end=' ')
print()

t1 = timeit.timeit("get_nth_prime1(5000)",  
              "from __main__ import get_nth_prime1", number=1)
t2 = timeit.timeit("get_nth_prime2(5000)",  
              "from __main__ import get_nth_prime2", number=1)
print(f"check until n: {t1}\ncheck until sqrt(n): {t2}")

Solution

  • filter doesn't stop early. It does filter out the numbers larger than the square root, but to do that, it still goes through all the numbers and tests them. It just doesn't pass the large ones on to you. So while you do save the cost for your i % j == 0 for the large ones, you pay the larger filter/lambda cost for them instead. And for the small ones you pay it additionally.

    You can instead use takewhile, that does do exactly what you were trying to do. It gives you the small numbers and entirely stops when it reaches the square root.

    In my own testing, your code reported times ~1.7 seconds without filter and ~4.9 seconds with filter. With takewhile, it's ~0.15 seconds. And ~0.08 seconds if you use lambda x: x < sqrt, computing that limit only once, before the loop. A simple if+break as the other answer suggested is still faster, though, as it avoids the filter/lambda overhead.