Are ref's safe to use without a mutex in a parallel environment

I am writing a fairly asynchronous program using the Thread library. I'll have a lot of different threads that need to access string list ref, is this safe without using a Mutex? I learned earlier today that Hashtbl requires a Mutex, but I couldn't find any information on ref.

Solution

In brief, if you have concurrent writes to mutable shared resource, you should protect them with a mutex (or atomics).

In more details, there are three important points to keep in mind.

First, OCaml threads are not parallel. In OCaml 4 and before, the OCaml runtime uses a runtime lock to ensure that only one OCaml thread executes at any point in time. In OCaml 5, the unit of parallelism is domain, and to preserve backward compatibility, each domain uses a domain lock to ensure that only one OCaml thread executes by domain.

Second, OCaml is always memory safe. Even in presence of race conditions, memory access are guaranteed to be well-typed. For reference, this means that all values that you read from the reference will be values that were written to the reference; and not some strange amalgamate of your program states.

However, without synchronization, concurrent programs are not guaranteed to behave the same way as an equivalent sequential program.

For instance, the following program will reach the assert false clause

let global = ref []
let sum = ref 0
let incr () =
  let state = !global in
  let a = Array.init 1_000 Fun.id in
  let updated = a :: state in
  global := updated

let decr () =
  match !global with
  | [] -> assert false
  | _ :: q ->
     global := q

let balanced () =
  let incrs = List.init 100 (fun _ -> Thread.create incr ()) in
  let () = List.iter Thread.join incrs in
  let decrs = List.init 100 (fun _ -> Thread.create decr ()) in
  List.iter Thread.join decrs

let () =
   while true do balanced () done

even if all calls to incr and decr are well balanced. The reason is that the read and write to the shared global references in incr and decr are not guarantees to happen at the same time. Thus it is possible that two calls to incr are interleaved in this way:

(* first call to incr *)               | (* Second call to incr *)                   
  let state = !global in               |
  let a = Array.init 1_000 Fun.id in   |
                                       | let state = !global in               
  let updated = a :: state in          |
  global := updated                    | let a = Array.init 1_000 Fun.id in 
                                       | let updated = a :: state in  
                                       | global := updated

which means that the second call to incr erases the update done by the first one, and after two calls to incr we end up with only one new element in the global list.

Consequently, synchronization primitives are a necessity as soon as you may have concurrent writes to the same mutable shared resource.

Third, in OCaml 5 (aka with parallelism) references cannot be used as synchronization primitives. This is the difference between references and atomics. In particular, if you have

module M: sig
   val update: unit -> unit
   val read: unit -> int option
end = struct
let written = ref false
let answer = ref 0

let update () =
  answer := 1;
  written := true

let read () =
  if !written then Some !answer else None
end

then on some CPU architecture it might happen than read () returns Some 0 because there is no guarantee than the write to answer is seen before the write to written.