Search code examples
concurrencyclojurefunctional-programming

How to atomically check if a key exists in a map and add it if it doesn't exist


I am trying to generate a new key that doesn't exist in my map (atom), then immediately add it to my map and return the key. However, the check for the key and the update are not done atomically. I am wondering how to do this atomically so that it is safe for concurrency.

Ideally this key is short enough to type but hard to guess (so a user can create a session, and his/her friends can join with the key). So 0,1,2,3... is not ideal since a user can try enter sessions n-1. Something like UUID where I don't have to worry about collisions is also not ideal. I was planning on generating a short random string (e.g. "udibwi") but I've used rand-int 25 in the code snippet below to simplify the problem.

I've written a function which randomly generates a key. It checks if the map contains it. If it already does then try a new key. If it doesn't, associate it to my map and then return the key.

This works but I don't think it is safe for multiple threads. Is there a way to do this using atoms or is there a better way?

(defonce sessions (atom {}))

(defn getNewSessionId []
  (let [id (rand-int 25)]
    (if (contains? @sessions id) 
      (createNewId) 
      (do 
        (swap! sessions assoc id "")
        id))))

Solution

  • You're trying to do too much at once. Having that one function generate an ID and update the atom is complicating things. I'd break this down into three functions:

    • A function that generates an ID based on an existing map
    • A function that updates a plain, immutable map using the above function
    • A function that updates an atom (although this will be so simple after implementing the previous two functions that it may not be necessary at all).

    Something like:

    ; Notice how this doesn't deal with atoms at all
    (defn generate-new-id [old-map]
      (let [new-id (rand-int 25)]
        (if (old-map new-id) ; or use "contains?"
          (recur old-map) ; Using "recur" so we don't get a StackOverflow
          new-id)))
    
    ; Also doesn't know anything about the atom
    (defn assoc-new-id [old-map]
      (let [new-id (generate-new-id old-map)]
        (assoc old-map new-id "")))
    
    (defonce data (atom {}))
    
    (defn swap-new-id! []
      (swap! data assoc-new-id))
    

    The main changes:

    • Everything that could be removed from the atom swapping logic was moved to its own function. This allows you to just pass the function handling all the logic to swap! and it will be handled atomically.

    • Clojure uses dash-case, not camelCase.

    • I used recur instead of actual recursion so you won't get a StackOverflow while the ID is being brute-forced.

    Of course though, this suffers from problems if the available number of IDs left is small. It may take a long time for it to "find" an available ID via brute-force. You might be better off using a "generator" backed by an atom to produce IDs atomically starting from 0:

    (defn new-id-producer []
      (atom -1))
    
    (defn generate-id [producer]
      (swap! producer inc)) ; "swap!" returns the new value that was swapped in
    
    (let [producer (new-id-producer)]
     ; Could be run on multiple threads at once
     (doseq [id (repeatedly 5 #(generate-id producer))]
       (println id))) 
    0
    1
    2
    3
    4
    => nil
    

    I tried to write an example of this operating on multiple threads at once:

    (let [producer (new-id-producer)
    
          ; Emulate the "consumption" of IDs
          consume (fn []
                    (doseq [id (repeatedly 20 #(generate-id producer))]
                      (println (.getId (Thread/currentThread)) id)))]
    
      (doto (Thread. consume)
            (.start))
    
      (doto (Thread. consume)
            (.start)))
    
    37 0
    3738 1
    38 3
    38 4
    38 5
    38 6
    38 7
    38 8
    38 9
    38 10
    38 11
    38 12
    38 13
    38 14
    38 15
    38 16
    38 17
    38 18
    38 19
    38 20
    38 21
     2
    37 22
    37 23
    37 24
    37 25
    37 26
    37 27
    37 28
    37 29
    37 30
    37 31
    37 32
    37 33
    37 34
    37 35
    37 36
    37 37
    37 38
    37 39
    

    But the un-synchronized nature of the printing to the outstream made this output a mess. If you squint a bit though, you can see that the threads (with Thread IDs of 37 and 38) are taking turns.


    If you need the new ID returned, the only clean way I know of that doesn't involve locking is to use a second atom to get the returned ID out of the swapping function. This requires getting rid of assoc-new-id:

    (defn generate-new-id [old-map]
      (let [new-id (rand-int 25)]
        (if (old-map new-id)
          (recur old-map)
          new-id)))
    
    (defn swap-new-id! [old-map]
      (let [result-atom (atom nil)]
    
        (swap! data (fn [m]
                      (let [id (generate-new-id m)]
                        (reset! result-promise id) ; Put the ID in the result atom
                        (assoc m id ""))))
    
        @result-promise)) ; Then retrieve it here
    

    Or, if a very inefficient solution is fine and you're using Clojure 1.9.0, you can just search the maps to find what key was added using clojure.set.difference:

    (defn find-new-id [old-map new-map]
      (clojure.set/difference (set (keys new-map))
                              (set (keys old-map))))
    
    (defn swap-new-id! []
      (let [[old-map new-map] (swap-vals! data assoc-new-id)] ; New in 1.9.0
        (find-new-id new-map old-map)))
    

    But again, this is very inefficient. It requires two iterations of each map.