Was looking at the exercises at the bottom of chapter 9 of clojure for the brave and true (in particular the last one of searching multiple engines and returning the first hit of each)
I mocked the actual search with slurp part to be this:
(defn search-for
[query engine]
(Thread/sleep 2000)
(format "https://www.%s.com/search?q%%3D%s", engine query))
And implemented the behavior like this:
(defn get-first-hit-from-each
[query engines]
(let [futs (map (fn [engine]
(future (search-for query engine))) engines)]
(doall futs)
(map deref futs)))
(I know the return here is a list and the exercise asks for a vector but can just do an into
for that...)
but when i run this in the REPL
(time (get-first-hit-from-each "gray+cat" '("google" "bing")))
it seems to take 2 seconds after I added the doall
(since map returns a lazy seq i don't think any of the futures even start unless i consume the seq, (last futs)
also seems to work) but when I use the time
macro in the REPL it reports almost no time consumed even if it is taking 2 seconds:
(time (get-first-hit-from-each "gray+cat" '("google" "bing")))
"Elapsed time: 0.189609 msecs"
("https://www.google.com/search?q%3Dgray+cat" "https://www.bing.com/search?q%3Dgray+cat")
what is going on with the time
macro here?
TL;DR: Lazy seqs don't play nice with the time
macro, and your function get-first-hit-from-each
returns a lazy seq. To make lazy seqs work with time
, wrap them in a doall
, as suggested by the documentation. See below for a fuller thought process:
The following is the definition of the time
macro in clojure.core
(source):
(defmacro time
"Evaluates expr and prints the time it took. Returns the value of
expr."
{:added "1.0"}
[expr]
`(let [start# (. System (nanoTime))
ret# ~expr]
(prn (str "Elapsed time: " (/ (double (- (. System (nanoTime)) start#)) 1000000.0) " msecs"))
ret#))
Notice how the macro saves the return value of expr
in ret#
, right after which it prints the elapsed time? Only after that does ret#
get returned. The key here is that your function get-first-hit-from-each
returns a lazy sequence (since map
returns a lazy sequence):
(type (get-first-hit-from-each "gray+cat" '("google" "bing")))
;; => clojure.lang.LazySeq
As such, when you do (time (get-first-hit-from-each "gray+cat" '("google" "bing")))
, what gets saved in ret#
is a lazy sequence, which doesn't actually get evaluated until we try to use its value...
We can check whether a lazy sequence has been evaluated using the realized?
function. So let's tweak the time
macro by adding a line to check whether ret#
has been evaluated, right after the elapsed time is printed:
(defmacro my-time
[expr]
`(let [start# (. System (nanoTime))
ret# ~expr]
(prn (str "Elapsed time: " (/ (double (- (. System (nanoTime)) start#)) 1000000.0) " msecs"))
(prn (realized? ret#)) ;; has the lazy sequence been evaluated?
ret#))
Now testing it:
(my-time (get-first-hit-from-each "gray+cat" '("google" "bing")))
"Elapsed time: 0.223054 msecs"
false
;; => ("https://www.google.com/search?q%3Dgray+cat" "https://www.bing.com/search?q%3Dgray+cat")
Nope... so that's why time
prints inaccurately. None of the computationally-long stuff actually gets to run before the printout is made.
To fix that and get an accurate time, we need to ensure evaluation of the lazy seq, which can be done by strategically placing a doall
in a bunch of possible places, either within your function, wrapping the map
:
(defn get-first-hit-from-each
[query engines]
(let [futs (map (fn [engine]
(future (search-for query engine))) engines)]
(doall futs)
(doall (map deref futs))))
;; => #'propeller.core/get-first-hit-from-each
(time (get-first-hit-from-each "gray+cat" '("google" "bing")))
"Elapsed time: 2005.478689 msecs"
;; => ("https://www.google.com/search?q%3Dgray+cat" "https://www.bing.com/search?q%3Dgray+cat")
or within time
, wrapping the function call:
(time (doall (get-first-hit-from-each "gray+cat" '("google" "bing"))))