Syntax for passing pmap a function with multiple parameters


I'm new to Clojure, so I'll do my best to explain the question. I have a large hash map that I split into partitions so that I can run a PageRank calculation in parallel. I want to use pmap, but I don't understand the syntax it requires. The calc-page-rank function takes four parameters. The collection I want to pmap over is mapSegments (a sequence of n maps), so that it steps through the maps one by one and runs calc-page-rank on each. I've used map with a single-parameter function, but never with a function that takes several. My current syntax does not work, and I'm not sure whether what I'm trying to do is even possible.

The combine-maps function just merges a sequence of maps into a single map. Example of mapSegments: ({0 [2 3 5 7 9], 1 [3 2 4 5], 2 [1]} {3 [0], 4 [1 5 8 9], 5 [1 0 6 9]} {6 [1 2 3], 7 [2 1 0], 8 [9 10 1]} {9 [2 3 1 8 7], 10 [1]})

My code:

(defn calc-page-rank [myMap inpagesMap outpagesCountMap pageRankMap]
  ;; let keeps d and p local; def inside a function would create global vars
  (let [d 0.85      ;; damping factor
        p (- 1 d)]  ;; probability of giving up
    (combine-maps (for [[pageID inpages] myMap]
                    {pageID (+ p (* d (reduce + (for [i inpages]
                                                  (/ (get pageRankMap i) (get outpagesCountMap i))))))}))))

(defn main-body [myMap]  
  (def inpagesMap (find-inpages myMap))
  (def outpagesCountMap (count-outpages myMap)) 
  (def mapSegments (map vec-to-map (split-map 4 myMap))) ;; returns a sequence of 4 hash maps 

; this is the part I'm confused with
  (print (combine-maps (pmap calc-page-rank mapSegments inpagesMap outpagesCountMap (start-rank myMap 1)))) ;; for each segment of the map, run calc-page-rank

; ....  
  (shutdown-agents)

Solution

  • If I understand the question correctly, the first two calls to calc-page-rank would look like:

    (calc-page-rank (first mapSegments) inpagesMap outpagesCountMap (start-rank myMap 1))
    (calc-page-rank (second mapSegments) inpagesMap outpagesCountMap (start-rank myMap 1))
    

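    If this is correct and the remaining parameters do not change from one call to the next, you can create a new function that takes a single parameter. This is needed because pmap, like map, treats every argument after the function as a collection and takes one element from each collection per call, so the fixed arguments cannot be passed to pmap directly.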

    (fn [segment] (calc-page-rank segment inpagesMap outpagesCountMap (start-rank myMap 1)))
    

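    This can be shortened with the anonymous-function shorthand: put a # in front of the calc-page-rank call and use % in place of the parameter. The example below uses mapv, which runs sequentially and returns a vector; pmap accepts the same function and collection if you want the calls to run in parallel.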

    (mapv #(calc-page-rank % inpagesMap outpagesCountMap (start-rank myMap 1)) mapSegments)
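
To tie this back to pmap, here is a minimal, self-contained sketch of the same closure technique. The calc-page-rank below is a hypothetical stand-in, not the function from the question: it keeps the four-parameter shape but only sums the current ranks of each page's inpages. The sample data, the kebab-case names, and the use of (apply merge ...) in place of combine-maps are likewise assumptions made for illustration.

```clojure
;; Hypothetical stand-in for the question's calc-page-rank. It has the same
;; four-parameter shape but only sums the current ranks of each page's inpages.
(defn calc-page-rank [segment inpages-map outpages-count-map page-rank-map]
  (into {}
        (for [[page-id inpages] segment]
          [page-id (reduce + (map #(get page-rank-map % 0) inpages))])))

;; Made-up sample data: two segments mapping page id -> inpages.
(def map-segments [{0 [1 2], 1 [0]} {2 [0 1]}])
(def inpages-map {})                 ; placeholder, unused by the stand-in
(def outpages-count-map {})          ; placeholder, unused by the stand-in
(def initial-ranks {0 1, 1 1, 2 1})  ; stands in for (start-rank myMap 1)

;; pmap receives a one-argument function that closes over the fixed arguments;
;; only the segment varies from call to call. doall forces the lazy result.
(def segment-results
  (doall
   (pmap #(calc-page-rank % inpages-map outpages-count-map initial-ranks)
         map-segments)))

;; Stand-in for combine-maps: merge the per-segment maps into one map.
;; With the sample data above, combined equals {0 2, 1 1, 2 2}.
(def combined (apply merge segment-results))
```

pmap returns its results in the same order as the input collection, so the per-segment maps line up with mapSegments. Binding the starting ranks once, outside the anonymous function, also avoids re-evaluating (start-rank myMap 1) for every segment, which is what happens when that expression sits inside the function body. Keep in mind that pmap only pays off when each call does enough work to outweigh the coordination overhead.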