seq function caveats in Clojure


The docstring of Clojure's seq function says:

Note that seqs cache values, thus seq should not be used on any Iterable whose iterator repeatedly returns the same mutable object.

What does this sentence mean? Why does it emphasize "the same mutable object"?


Solution

  • That note was added to the docstring later. The ticket behind the change explains:

    Some Java libraries return iterators that return the same mutable object on every call:

    • Hadoop ReduceContextImpl$ValueIterator
    • Mahout DenseVector$AllIterator/NonDefaultIterator
    • LensKit FastIterators

    While careful usage of seq or iterator-seq over these iterators worked in the past, that is no longer true as of the changes in CLJ-1669 - iterator-seq now produces a chunked sequence. Because next() is called 32 times on the iterator before the first value can be retrieved from the seq, and the same mutable object is returned every time, code on iterators like this now receives different (incorrect) results.

    Approach: Sequences cache values and are thus incompatible with holding mutable and mutating Java objects. We will add some clarification about this to seq and iterator-seq docstrings. For those iterators above, it is recommended to either process those iterators in a loop/recur or to wrap them in a lazy-seq that transforms each re-returned mutable object into a proper value prior to caching.
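
To make the ticket's point concrete, here is a minimal sketch, assuming Clojure 1.7 or later (where iterator-seq is chunked). The reusing-iterator, copy-with-loop and value-seq names are hypothetical helpers written for this illustration, and AtomicLong merely stands in for the mutable objects those libraries hand back. It shows the stale results from a naive iterator-seq and the two workarounds the ticket recommends.

```clojure
(import '(java.util Iterator)
        '(java.util.concurrent.atomic AtomicLong))

;; Hypothetical stand-in for the library iterators listed above:
;; every call to next() mutates and returns the SAME AtomicLong.
(defn reusing-iterator [n]
  (let [holder (AtomicLong. 0)]
    (reify Iterator
      (hasNext [_] (< (.get holder) n))
      (next [_]
        (.incrementAndGet holder)
        holder))))

;; Naive use: the seq caches three references to one object, and the
;; chunked iterator-seq has already called next() three times before
;; the first element is read, so every element shows the last state.
(def naive
  (mapv #(.get ^AtomicLong %) (iterator-seq (reusing-iterator 3))))
;; => [3 3 3] on Clojure 1.7+

;; Workaround 1: loop/recur, copying a value out after each next().
(defn copy-with-loop [^Iterator it]
  (loop [acc []]
    (if (.hasNext it)
      (recur (conj acc (.get ^AtomicLong (.next it))))
      acc)))

;; Workaround 2: a lazy-seq that turns each re-returned object into
;; an immutable value before the seq caches it.
(defn value-seq [^Iterator it]
  (lazy-seq
    (when (.hasNext it)
      (cons (.get ^AtomicLong (.next it)) (value-seq it)))))

(copy-with-loop (reusing-iterator 3))   ;; => [1 2 3]
(vec (value-seq (reusing-iterator 3)))  ;; => [1 2 3]
```

In both workarounds the key step is the same: extract an immutable value right after each call to next(), before the iterator gets a chance to overwrite the shared object.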