Search code examples
pythonalgorithmprimespython-itertoolssieve-of-eratosthenes

Merged iterators produce obscure results


I'm trying to implement prime number generator using Sieve of Eratosthenes algorithm. I do it just to try using recursive iterator merging to implement sifter.

What I do is this:

from itertools import count,islice,groupby
from heapq import merge


def primes3():
    p = 2
    yield p
    sifter = (i*p for i in count(p))
    s = next(sifter)
    for p in count(p+1):
        if p==s: # this p is sieved out
            print('s: {}'.format(s))
            s = next(sifter)
        else:
            yield p # this is prime
            print('p: {}'.format(p))
            sifter = (k for k, g in groupby(merge(sifter,(i*p for i in count(p))))) #add this serie to the sifter: p*p, p*(p+1), p*(p+2), ...


print(list(islice(primes3(),10)))

The output is:

p: 3
s: 4
p: 5
p: 6
p: 7
p: 8
p: 9
p: 10
p: 11
s: 12
[2, 3, 5, 6, 7, 8, 9, 10, 11, 13]

The first number to be sieved out is 4. But the next is 12, not 6 as it should be. Why is that? I checked it with the following code:

>>> sifter = (i*2 for i in count(2))
>>> next(sifter)
4
>>> sifter = (k for k, g in groupby(merge(sifter,(i*3 for i in count(3)))))
>>> print(list(islice(sifter,20)))
[6, 8, 9, 10, 12, 14, 15, 16, 18, 20, 21, 22, 24, 26, 27, 28, 30, 32, 33, 34]

So, as we may see, in test conditions sifter yields the correct results.

Where am I making a mistake? I think it must be something trivial that I just don't see.


Solution

  • I have to agree, this stuff can sometimes be very confusing (but a nice puzzle!).

    Turns out that your sifter iterator changes when the value of p changes (by the way, I'm using python 2.6.5 to test this, not python 3, so print syntax is a bit different):

    >> p = 2
    >> sifter = (i*p for i in count(p))
    >> for x in range(3):
    >>    print next(sifter)
    4
    6
    8
    >>> # now lets see what happens when we change p
    >>> p = 3
    >>> for x in range(3):
    >>>     print next(sifter)
    15
    18
    21
    

    So, the i*p part of the iterator is evaluated with the new of p as soon as p has been updated. An p is indeed updated in your mainloop, which is why you don't get the expected result.

    There is an easy way to solve this: create a function to generate the sifter iterator such that the iterator isn't bounded to p:

    def gensifter(x):
        return (i*x for i in count(x))
    

    and put this in your code (again, I converted it to python 2.6.5):

    def primes3():
        p = 2
        yield p
        sifter = gensifter(p) # <-- changed!
        s = next(sifter)
        for p in count(p+1):
            #print '(testing p = %d\ts = %d)' % (p, s)
            if p==s: # this p is sieved out
                print 's:', s
                s = next(sifter)
            else:
                yield p # this is prime
                print 'p:', p
                sifter = (k for k, g in groupby(merge(sifter,gensifter(p)))) # <-- changed!
    

    Let's see the result now:

    >>> print list(islice(primes3(), 10))
    p: 3
    s: 4
    p: 5
    s: 6
    p: 7
    s: 8
    s: 9
    s: 10
    p: 11
    s: 12
    p: 13
    s: 14
    s: 15
    s: 16
    p: 17
    s: 18
    p: 19
    s: 20
    s: 21
    s: 22
    p: 23
    s: 24
    s: 25
    s: 26
    s: 27
    s: 28
    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
    

    Primenumbers galore!