Search code examples
algorithmfunctional-programmingprimessieve-of-eratosthenes

Is there a fast, functional prime generator?


Suppose I've got a natural number n and I want a list (or whatever) of all primes up to n.

The classic prime sieve algorithm runs in O(n log n) time and O(n) space -- it's fine for more imperative languages, but requires in-place modification to lists and random access, in a fundamental way.

There's a functional version involving priority queues, which is pretty slick -- you can check it out here. This has better space complexity at about O(n / log(n)) (asymptotically better but debatable at practical scales). Unfortunately the time analysis is nasty, but it's very nearly O(n^2) (actually, I think it's about O(n log(n) Li(n)), but log(n) Li(n) is approximately n).

Asymptotically speaking it would actually be better just to check the primality of each number as you generate it, using successive trial division, as that would take only O(1) space and O(n^{3/2}) time. Is there a better way?

Edit: it turns out my calculations were simply incorrect. The algorithm in the article is O(n (log n) (log log n)), which the articles explains and proves (and see the answer below), not the complicated mess I put above. I'd still enjoy seeing a bona-fide O(n log log n) pure algorithm if there is one out there.


Solution

  • Here's a Haskell implementation of Melissa O'Neill's algorithm (from the linked article). Unlike the implementation that Gassa linked to, I've made minimal use of laziness, so that the performance analysis is clear -- O(n log n log log n), i.e., linearithmic in n log log n, the number of writes made by the imperative Sieve of Eratosthenes.

    The heap implementation is just a tournament tree. The balancing logic is in push; by swapping the children every time, we ensure that, for every branch, the left subtree is the same size or one bigger compared to the right subtree, which ensures depth O(log n).

    module Sieve where
     
    type Nat = Int
     
    data Heap = Leaf !Nat !Nat
              | Branch !Nat !Heap !Heap
              deriving Show
     
    top :: Heap -> Nat
    top (Leaf n _) = n
    top (Branch n _ _) = n
     
    leaf :: Nat -> Heap
    leaf p = Leaf (3 * p) p
     
    branch :: Heap -> Heap -> Heap
    branch h1 h2 = Branch (min (top h1) (top h2)) h1 h2
     
    pop :: Heap -> Heap
    pop (Leaf n p) = Leaf (n + 2 * p) p
    pop (Branch _ h1 h2)
      = case compare (top h1) (top h2) of
            LT -> branch (pop h1) h2
            EQ -> branch (pop h1) (pop h2)
            GT -> branch h1 (pop h2)
     
    push :: Nat -> Heap -> Heap
    push p h@(Leaf _ _) = branch (leaf p) h
    push p (Branch _ h1 h2) = branch (push p h2) h1
     
    primes :: [Nat]
    primes
      = let helper n h
              = case compare n (top h) of
                    LT -> n : helper (n + 2) (push n h)
                    EQ -> helper (n + 2) (pop h)
                    GT -> helper n (pop h)
          in 2 : 3 : helper 5 (leaf 3)