This is a simple attempt to reproduce some code that Ross Ihaka gave as an example of poor R performance. I was curious as to whether Clojure's persistent data structures would offer any improvement. (https://www.stat.auckland.ac.nz/~ihaka/downloads/JSM-2010.pdf)
However, I'm not even getting to first base: I get a StackOverflowError and not much else to go by. Any ideas? Apologies in advance if the question has an obvious answer I've missed...
; Testing Ross Ihaka's example of poor R performance
; against Clojure, to see if persistent data structures help
(def dd (repeat 60000 '(0 0 0 0)))
(defn repl-row [d i new-r]
  (concat (take (dec i) d) (list new-r) (drop i d)))
(defn changerows [d new-r]
  (loop [i 10000
         data d]
    (if (zero? i)
      data
      (let [j (rand-int 60000)
            newdata (repl-row data j new-r)]
        (recur (dec i) newdata)))))
user=> (changerows dd '(1 2 3 4))
StackOverflowError clojure.lang.Numbers.isPos (Numbers.java:96)
Further, if anyone has any ideas how persistent functional data structures can be used to best advantage in the example above, I'd be very keen to hear. The speedup reported from not using immutable structures (link above) was about 500%!
Looking at the stack trace for the StackOverflowError, this seems to be an "exploding thunk" (lazy/suspended calculation) problem that isn't obviously related to the recursion in your example:
java.lang.StackOverflowError
at clojure.lang.RT.seq(RT.java:528)
at clojure.core$seq__5124.invokeStatic(core.clj:137)
at clojure.core$concat$cat__5217$fn__5218.invoke(core.clj:726)
at clojure.lang.LazySeq.sval(LazySeq.java:40)
at clojure.lang.LazySeq.seq(LazySeq.java:49)
at clojure.lang.RT.seq(RT.java:528)
at clojure.core$seq__5124.invokeStatic(core.clj:137)
at clojure.core$take$fn__5630.invoke(core.clj:2876)
Changing this line to realize newdata into a vector resolves the issue:
(recur (dec i) (vec newdata))
This workaround addresses the use of concat in repl-row by forcing concat's lazy sequence to be realized on each step. concat returns lazy sequences, and in your loop/recur you're passing the lazy, unevaluated results of previous concat calls as input to subsequent concat calls, which return more lazy sequences based on the previous, unrealized ones. The final concat-produced lazy sequence isn't realized until the loop finishes, which results in a stack overflow due to its dependence on thousands of previous concat-produced lazy sequences.
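To see the same failure mode in isolation, here's a minimal sketch (not from the original question) that stacks unrealized concat thunks and then forces them:

;; Each concat wraps the previous lazy seq without realizing it,
;; building ~100000 levels of nested thunks.
(def exploding (reduce (fn [s _] (concat s [0])) [] (range 100000)))

;; Realizing even the first element must descend through every
;; nested thunk, overflowing the stack.
(first exploding) ;; => StackOverflowError

Forcing realization on each pass, e.g. with vec as above (or doall, which realizes a sequence without changing its type), keeps the nesting shallow and avoids the overflow.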
Further, if anyone has any ideas how persistent functional data structures can be used to best advantage in the example above, I'd be very keen to hear.
Since it seems the usage of concat here is simply to replace an element in a collection, we can get the same effect by using a vector and assoc-ing the new item into the correct position of the vector:
(def dd (vec (repeat 60000 '(0 0 0 0))))
(defn changerows [d new-r]
  (loop [i 10000
         data d]
    (if (zero? i)
      data
      (let [j (rand-int 60000)
            newdata (assoc data j new-r)]
        (recur (dec i) newdata)))))
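As a quick sanity check at the REPL (a sketch, assuming the defs above), the result keeps its 60000 rows and contains the replacement row:

(let [result (changerows dd '(1 2 3 4))]
  ;; row count, and the first row equal to the replacement
  [(count result) (some #{'(1 2 3 4)} result)])
;; => [60000 (1 2 3 4)]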
Notice there's no more repl-row function; we just assoc into data using the index and the new value. After some rudimentary benchmarking with time, this approach seems to be many times faster:
"Elapsed time: 75836.412166 msecs" ;; original sample w/fixed overflow
"Elapsed time: 2.984481 msecs" ;; using vector+assoc instead of concat
And here's another way to solve it by viewing the series of replacements as an infinite sequence of replacement steps, then sampling from that sequence:
(defn random-replace [replacement coll]
  (assoc coll (rand-int (count coll)) replacement))

(->> (iterate (partial random-replace '(1 2 3 4)) dd)
     (drop 10000) ;; could also use `nth` here
     (first))
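As the comment above hints, nth collapses the drop/first pair into a single step; the 10000th element of the iteration is dd after 10000 random replacements:

(nth (iterate (partial random-replace '(1 2 3 4)) dd) 10000)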