haskell · clojure · functional-programming · currying · transducer

What's the difference between these functions implemented with currying and transducers?


Take the three functions below, the first implemented in Haskell and the other two in Clojure:

f :: [Int] -> Int
f = foldl1 (+) . map (*7) . filter even

(defn f [coll]
  ((comp
    (partial reduce +)
    (partial map #(* 7 %))
    (partial filter even?)) coll))

(defn f [coll]
  (transduce
    (comp
      (filter even?)
      (map #(* 7 %)))
    + coll))

When they are applied to a list like [1, 2, 3, 4, 5] they all return 42. I know the machinery behind the first two is similar, since map is lazy in Clojure, but the third one uses transducers. Could someone show the intermediate steps for the execution of these functions?


Solution

  • The intermediate steps of the second and third examples are the same for this specific input. This is because map and filter are implemented as lazy transformations of a sequence into a sequence, as you're no doubt already aware.
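
    To make those steps concrete, here is roughly how the lazy-sequence version unfolds for [1 2 3 4 5] (a sketch of the logical evaluation order, not the exact internal thunks):

    (filter even? [1 2 3 4 5])  ;; => (2 4), produced lazily on demand
    (map #(* 7 %) '(2 4))       ;; => (14 28), also lazy
    (reduce + '(14 28))         ;; => 42, forces the whole chain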

    The transducer versions of map and filter are defined using the same essential functionality as the non-transducer versions, except that the way they “conj” (or not, in the case of filter) onto the result stream is defined elsewhere. Indeed, if you look at the source for map, there are explicit data-structure construction functions in use, whereas the transducer variant uses no such functions -- they are passed in via rf. Explicitly using cons in the non-transducer versions means they're always going to be dealing with sequences.
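
    To illustrate that shape (a simplified sketch, not clojure.core's actual implementation, which also handles chunking and multiple input collections), the transducer arity of map looks roughly like the following; note that rf, not the transducer, decides how results are accumulated:

    (defn map-xf [f]  ;; hypothetical name; clojure.core/map's 1-arity does this
      (fn [rf]        ;; rf is the reducing function supplied by the consumer
        (fn
          ([] (rf))                ;; init
          ([result] (rf result))   ;; completion
          ([result input]          ;; step: transform, then let rf accumulate
           (rf result (f input))))))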

    IMO, the main benefit of using transducers is that you can define the process you're applying separately from the thing that will consume it. Therefore perhaps a more interesting rewrite of your third example is:

    (def process (comp (filter even?)
                       (map #(* 7 %))))
    
    (defn f [coll] (transduce process + coll))
    

    It's left as an exercise for the application author to decide when this sort of abstraction is necessary, but it can definitely open an opportunity for reuse.


    It may occur to you that you could simply write

    (defn process [coll]
      ((comp
        (partial map #(* 7 %))
        (partial filter even?)) coll))
    
    (reduce + (process coll))
    

    and get the same effect; this is true. When your input is always a sequence (or always the same kind of stream, or you know what kind of stream it will be), there's arguably no good reason to create a transducer. But the power of reuse can be demonstrated here (assume process is a transducer):

    (chan 1 process)  ;; an async channel which runs process on all inputs
    
    (into [] process coll)  ;; writing to a vector
    
    (transduce process + coll)  ;; your goal
    

    The motivation behind transducers was essentially to stop having to write new collection functions for different collection types. Rich Hickey mentions his frustration writing functions like map<, map>, mapcat<, mapcat>, and so on in the core.async library -- what map and mapcat are is already defined, but because they assume they operate on sequences (that explicit cons mentioned above), they can't be applied to asynchronous channels. But channels can supply their own rf in the transducer version, letting them reuse these functions.
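
    For instance (a minimal sketch, assuming core.async is on the classpath and process is the transducer defined earlier), a channel can apply the same process to every value that passes through it:

    (require '[clojure.core.async :as a])

    (def c (a/chan 1 process))  ;; the channel's buffer applies process to each put

    (a/>!! c 2)  ;; 2 is even, so it is transformed and buffered as 14
    (a/<!! c)    ;; => 14
    (a/>!! c 3)  ;; 3 is odd; the transducer drops it, but the put still succeeds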