haskell list-comprehension sieve-algorithm

Sieve of Sundaram - list comprehension

I am trying to write a function that calculates all odd prime numbers from 1..n using the "Sieve of Sundaram" algorithm.

Here is my try:

sSund :: Integer -> [Integer]
sSund n = [ i * 2 + 1 | i <- [1..n], j <- [f i], (i + j + 2 * i * j) > n ] 
  where f 1 = 1 
        f y = y + 1 --use function f because i don't know how insert 1 into j's list

But it gives some wrong numbers like 9,15,21,25, etc.

*Main> sSund 30
[7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61]

What am I doing wrong?

Solution

How it works

Sundaram's seive works by focussing on the odd numbers 2n+1, and excluding those that are the product of numbers.

If two numbers multiply to make an odd number, they must both be odd, so our number 2n+1 = (2i+1)(2j+1). If we multiply that out we get 2n+1 = 4ij + 2i +2j + 1, which we can simplify to 2n=4ij+2i+2j, which again simplifies to n=2ij+i+j. So we don't want n if we can write it as 2ij+i+j. This is true for any numbers i and j, but it's OK to just get rid of the ones where i<=j, because otherwise you're definitely excluding the same number twice.

Fixing your code

In your code, you generate some numbers i + j + 2 * i * j to be excluded, but you in fact just exclude the i instead of the i + j + 2 * i * j. The j<-[f i] just gives you a single j value in a list instead all the numbers from i up to n, which you should write as [i..n].

It's much simpler to just generate the exclusion list first:

sSundDelete :: Integer -> [Integer]            
sSundDelete n = [i+j+2*i*j|i<-[1..n], j<-[i..n]]

Here I've decided to just allow i and j to be between 1 and n, because otherwise 2ij+i+j is definitely bigger than n.

Now we can make a list of numbers x which don't include these numbers, and then make them odd with the formula 2*n+1:

sSund :: Integer -> [Integer]
sSund n = let del = sSundDelete n in
     2:[2*x+1 | x <- [1..n], not (x `elem` del)]

Which correctly gives you

> sSund 30
[2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61]

Speeding things up

It's not as fast as it could be, though, because if you look at

> sSundDelete 10
[4,7,10,13,16,19,22,25,28,31,12,17,22,27,32,37,42,47,52,24,31,38,45,52,59,66,73,40,49,58,67,76,85,94,60,71,82,93,104,115,84,97,110,123,136,112,127,142,157,144,161,178,180,199,220]

it has numbers much bigger than we need - sSund 10 only goes as far as 2*10+1=21. This means we're checking our numbers again and again against numbers that we didn't consider anyway!

The simplest thing to do about this is to rewrite sSundDelete to say

sSundDelete n = [i+j+2*i*j|i<-[1..n], j<-[i..n],i+j+2*i*j<=n]

very much as you did, or

sSundDelete n = filter (<= n) [i+j+2*i*j|i<-[1..n], j<-[i..n]]

Using a bit of maths to speed things up

The problem with these is that they generate too many numbers and then throw them away. It would be faster to generate only the numbers we need.

Actually, I think it's best to calculate how far to go. the smallest j we will ever use is i, so the smallest that 2ij+i+j can be is 2i²+2i. If we don't want that to be over n, we want 2i²+2i<=n, which we can rewrite as 2i(i+1)<=n. Correctness is more important than efficiency, so it's OK to go over n a bit, but it's important not to miss out numbers below n, so we're OK to say 2i²<=n. This can be expressed as i <= floor (sqrt (fromIntegral n / 2)) (floor truncates decimals, so floor 35.7 is 35, and fromIntegral is used here to convert n to a floating point number (allowing non-integers) so we can do division and square roots.

That was a lot of working out, but now we can just calculate once how big i should go:

sSundDelete n = filter (<= n) [i+j+2*i*j|i<-[1..floor (sqrt (fromIntegral n / 2))], j<-[i..n]]

We can do a similar job on j. We want 2ij+i+j<=n, which we can rearrange to say (2i+1)j<=n-i which can be done as j<=floor( (n'-i')/(2*i'+1)) where i'=fromIntegral i and n'=fromIntegral n. This gives us

sSundDelete n = [i+j+2*i*j|let n'=fromIntegral n,
                           i<-[1..floor (sqrt (n' / 2))],
                           let i' = fromIntegral i,
                           j<-[i..floor( (n'-i')/(2*i'+1))]]

This makes it fast enough for me to not give up waiting for sSund 5000 to calculate the second prime number!