Search code examples
ocamlmutex

Are ref's safe to use without a mutex in a parallel environment


I am writing a fairly asynchronous program using the Thread library. I'll have a lot of different threads that need to access string list ref, is this safe without using a Mutex? I learned earlier today that Hashtbl requires a Mutex, but I couldn't find any information on ref.


Solution

  • In brief, if you have concurrent writes to mutable shared resource, you should protect them with a mutex (or atomics).

    In more details, there are three important points to keep in mind.

    First, OCaml threads are not parallel. In OCaml 4 and before, the OCaml runtime uses a runtime lock to ensure that only one OCaml thread executes at any point in time. In OCaml 5, the unit of parallelism is domain, and to preserve backward compatibility, each domain uses a domain lock to ensure that only one OCaml thread executes by domain.

    Second, OCaml is always memory safe. Even in presence of race conditions, memory access are guaranteed to be well-typed. For reference, this means that all values that you read from the reference will be values that were written to the reference; and not some strange amalgamate of your program states.

    However, without synchronization, concurrent programs are not guaranteed to behave the same way as an equivalent sequential program.

    For instance, the following program will reach the assert false clause

    let global = ref []
    let sum = ref 0
    let incr () =
      let state = !global in
      let a = Array.init 1_000 Fun.id in
      let updated = a :: state in
      global := updated
    
    let decr () =
      match !global with
      | [] -> assert false
      | _ :: q ->
         global := q
    
    let balanced () =
      let incrs = List.init 100 (fun _ -> Thread.create incr ()) in
      let () = List.iter Thread.join incrs in
      let decrs = List.init 100 (fun _ -> Thread.create decr ()) in
      List.iter Thread.join decrs
    
    let () =
       while true do balanced () done
    

    even if all calls to incr and decr are well balanced. The reason is that the read and write to the shared global references in incr and decr are not guarantees to happen at the same time. Thus it is possible that two calls to incr are interleaved in this way:

    (* first call to incr *)               | (* Second call to incr *)                   
      let state = !global in               |
      let a = Array.init 1_000 Fun.id in   |
                                           | let state = !global in               
      let updated = a :: state in          |
      global := updated                    | let a = Array.init 1_000 Fun.id in 
                                           | let updated = a :: state in  
                                           | global := updated
    

    which means that the second call to incr erases the update done by the first one, and after two calls to incr we end up with only one new element in the global list.

    Consequently, synchronization primitives are a necessity as soon as you may have concurrent writes to the same mutable shared resource.

    Third, in OCaml 5 (aka with parallelism) references cannot be used as synchronization primitives. This is the difference between references and atomics. In particular, if you have

    module M: sig
       val update: unit -> unit
       val read: unit -> int option
    end = struct
    let written = ref false
    let answer = ref 0
    
    let update () =
      answer := 1;
      written := true
    
    let read () =
      if !written then Some !answer else None
    end
    

    then on some CPU architecture it might happen than read () returns Some 0 because there is no guarantee than the write to answer is seen before the write to written.