Search code examples
javaperformanceclojure

Why does Clojure spend so much time on clojure.lang.Iterate.first?


I'm quite fond of the story about Frank Nelson Cole, who in 1903 demonstrated the prime factorization of 2^67 - 1 in a famous "lecture without words". These days the factorization can be readily found using the following naive algorithm:

(def mersenne67 (dec (expt 2 67)))

(->> (iterate inc 2)
     (filter #(zero? (rem mersenne67 %)))
     (first))

However I've noticed that this Clojure code takes roughly twice as long as the equivalent Java or Kotlin code. (~40 vs ~20 seconds on my machine)
Here's the Java I compared it to:

  public static BigInteger mersenne() {
    BigInteger mersenne67 = 
      BigInteger.valueOf(2).pow(67).subtract(BigInteger.ONE);

    return Stream.iterate(BigInteger.valueOf(2), (x -> x.add(BigInteger.ONE)))
      .filter(x -> mersenne67.remainder(x).equals(BigInteger.ZERO))
      .findFirst()
      .get();
  }

Rewriting the Clojure code on a lower level made no difference:

(def mersenne67 (-> (BigInteger/valueOf 2)
                (.pow (BigInteger/valueOf 67))
                (.subtract BigInteger/ONE)))

(->> (iterate #(.add ^BigInteger % BigInteger/ONE) (BigInteger/valueOf 2))
     (filter #(= BigInteger/ZERO (.remainder ^BigInteger mersenne67 %)))
     (first))

After profiling the code with the VisualVM the primary suspect seems to be clojure.lang.Iterate.first() which accounts almost exactly for the difference in how long these functions run. Java's equivalent java.util.stream.ReferencePipeline.findFirst() runs for only a fraction as long (~22 vs ~2 seconds). This leads to my question: How does Java (and Kotlin) get away with spending so much less time on this task?


Solution

  • I apologize in advance for grave-digging, but I'm a bit concerned about ClojureMostly's answer. It surely solves the problem in timely manner but it looks like a dirty hack to me: passing anonymous reducing function, which ignores current result (_) and finishes as soon as first factor is found (reduced).

    How about using transducers and transduce function:

    (defn first-factor-tr 
      [^BigInteger n] 
      (transduce
        (comp (filter #(identical? BigInteger/ZERO (.remainder ^BigInteger n %))) (take 1))
        conj nil
        (iterate #(.add ^BigInteger % BigInteger/ONE) (BigInteger/valueOf 2))))
    

    Filter out all values in collection with remainder of zero (filter ...) and take first one (take ...).

    Execution time of this solution is on par with the one with reduce:

    (defn first-factor-reduce 
      [^BigInteger n] 
      (reduce
        (fn [_ x]
          (when (identical? BigInteger/ZERO (.remainder ^BigInteger n x))
                (reduced x)))
        nil
        (iterate #(.add ^BigInteger % BigInteger/ONE) (BigInteger/valueOf 2))))
    
    => (time (first-factor-tr mersenne67))
    "Elapsed time: 20896.594178 msecs"
    (193707721)
    => (time (first-factor-reduce mersenne67))
    "Elapsed time: 20424.323894 msecs"
    193707721