Search code examples
clojurestack-overflowlazy-evaluation

Clojure - StackOverflowError while iterating over lazy collection


I am currently implementing solution for one of Project Euler problems, namely Sieve of Eratosthenes (https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes), in Clojure. Here's my code:

(defn cross-first-element [coll]
  (filter #(not (zero? (rem % (first coll)))) coll))

(println
  (last
  (map first
    (take-while
      (fn [[primes sieve]] (not (empty? sieve)))
      (iterate
        (fn [[primes sieve]] [(conj primes (first sieve)) (cross-first-element sieve)])
        [[] (range 2 2000001)])))))

The basic idea is to have two collections - primes already retrieved from the sieve, and the remaining sieve itself. We start with empty primes, and until the sieve is empty, we pick its first element and append it to primes, and then we cross out the multiples of it from the sieve. When it's exhausted, we know we have all prime numbers from below two millions in the primes.

Unfortunately, as good as it works for small upper bound of sieve (say 1000), it causes java.lang.StackOverflowError with a long stacktrace with repeating sequence of:

...
clojure.lang.RT.seq (RT.java:531)
clojure.core$seq__5387.invokeStatic (core.clj:137)
clojure.core$filter$fn__5878.invoke (core.clj:2809)
clojure.lang.LazySeq.sval (LazySeq.java:42)
clojure.lang.LazySeq.seq (LazySeq.java:51)
...

Where is the conceptual error in my solution? How to fix it?


Solution

  • the reason for this is the following: since the filter function in your cross-first-element is lazy, it doesn't actually filter your collection on every iterate step, rather it 'stacks' filter function calls. This leads to the situation that when you are going to actually need the resulting element, the whole load of test functions would be executed, roughly like this:

    (#(not (zero? (rem % (first coll1))))
      (#(not (zero? (rem % (first coll2))))
        (#(not (zero? (rem % (first coll3))))
           ;; and 2000000 more calls
    

    leading to stack overflow.

    the simplest solution in your case is to make filtering eager. You can do it by simply using filterv instead of filter, or wrap it into (doall (filter ...

    But still your solution is really slow. I would rather use loop and native arrays for that.