I'm trying to use the Sieve of Eratosthenes' algorithm in Haskell to generate a stream of primes but the code doesn't seems to work. This is the main idea.
The main idea is derived from our functional programming lectures, which in turn refer to Melissa O’Neil's paper, "The Genuine Sieve of Eratosthenes," JFC, 2009, which discussed what the sieve of Eratosthenes truly is in Haskell.
union :: Ord a => [a] -> [a] -> [a]
union xs@(x:txs) ys@(y:tys)
| x < y = x:union txs ys
| x > y = y:union xs tys
-- se sono uguali, ne scrivo uno solo
| otherwise = x:union txs tys
minus :: Ord a => [a] -> [a] -> [a]
minus xs@(x:txs) ys@(y:tys)
| x < y = x : minus txs ys
| x > y = y : minus xs tys
| otherwise = x : minus xs ys
unionP :: Ord a => [a] -> [a] -> [a]
unionP (x:xs) ys = x:union xs ys -- poi si va con union
unionAll :: [[Integer]] -> [Integer]
unionAll = foldr1 unionP
primes :: [Integer]
primes = 2:[3..] `minus` composites
where composites = unionAll [map (p*) [p..] | p <-primes]
-- Test
firstPrimes = take 15 primes
This is what actually happened, it doesn't go beyond this:
ghci> firstPrimes [2
So i had to interrupt the process:
ghci> firstPrimes [2^CInterrupted.
ghci>
First, your implementation of minus
is wrong. O'Neill's paper uses:
(x:xs) `minus` (y:ys) | x < y = x:(xs `minus` (y:ys))
| x == y = xs `minus` ys
| x > y = (x:xs) `minus` ys
Converting to your notation, it should look like this:
minus :: Ord a => [a] -> [a] -> [a]
minus xs@(x:txs) ys@(y:tys)
| x < y = x : minus txs ys
| x > y = minus xs tys
| otherwise = minus txs ys
Second, the definition of union
in O'Neill's paper uses foldr
not foldr1
, so it's equivalent to:
unionAll = foldr unionP []
instead of:
unionAll = foldr1 unionP
That may not seem like a big deal, but simple definitions of these functions look like this:
foldr f (x:xs) = f x (foldr f xs)
foldr f [] = []
foldr1 f [x] = [x]
foldr1 f (x:xs) = f x (foldr1 f xs)
See how, for foldr1
, the case for [x]
comes before the case for (x:xs)
? For an infinite input list, the case [x]
never occurs, but Haskell still checks the case [x]
before handling the case (x:xs)
. This makes foldr1
slightly less lazy than foldr
.
For example:
λ> head $ foldr unionP [] ([1]:undefined)
1
λ> head $ foldr1 unionP ([1]:undefined)
*** Exception: Prelude.undefined
CallStack (from HasCallStack):
undefined, called at <interactive>:86:27 in interactive:Ghci4
In your composites
call, where unionAll
is called on the infinite list of lists:
bigList = [map (p*) [p..] | p <- primes]
because of the way foldr1
is defined, Haskell can't actually return the first element from unionAll bigList
until it has checked that bigList
has more than one element, but this requires ensuring that there's a second element in the list:
primes = 2 : ([3..] `minus` composites)
and this requires checking that the following list is non-empty:
[3..] `minus` composites
and that requires making sure that composites
starts with a number greater than 3
, but that requires evaluating:
unionAll bigList
which is what started the whole computation, and you get an infinite loop.
Anyway, all you need to do to fix this is replace:
unionAll = foldr1 unionP
with the slightly lazier:
unionAll = foldr unionP []
and it should work.
Here's a full working version of your code with the corrections to minus
and unionAll
that generates primes without getting stuck:
union :: Ord a => [a] -> [a] -> [a]
union xs@(x:txs) ys@(y:tys)
| x < y = x:union txs ys
| x > y = y:union xs tys
| otherwise = x:union txs tys
minus :: Ord a => [a] -> [a] -> [a]
minus xs@(x:txs) ys@(y:tys)
| x < y = x : minus txs ys
| x > y = minus xs tys
| otherwise = minus txs ys
unionP :: Ord a => [a] -> [a] -> [a]
unionP (x:xs) ys = x:union xs ys -- poi si va con union
unionAll :: [[Integer]] -> [Integer]
unionAll = foldr unionP []
primes :: [Integer]
primes = 2:[3..] `minus` composites
where composites = unionAll [map (p*) [p..] | p <- primes]
-- Test
firstPrimes = take 15 primes