I am writing a fairly asynchronous program using the Thread
library. I'll have a lot of different threads that need to access string list ref
, is this safe without using a Mutex
? I learned earlier today that Hashtbl
requires a Mutex
, but I couldn't find any information on ref
.
In brief, if you have concurrent writes to mutable shared resource, you should protect them with a mutex (or atomics).
In more details, there are three important points to keep in mind.
First, OCaml threads
are not parallel. In OCaml 4 and before, the OCaml runtime uses a runtime lock to ensure that only one OCaml thread executes at any point in time. In OCaml 5, the unit of parallelism is domain, and to preserve backward compatibility, each domain uses a domain lock to ensure that only one OCaml thread executes by domain.
Second, OCaml is always memory safe. Even in presence of race conditions, memory access are guaranteed to be well-typed. For reference, this means that all values that you read from the reference will be values that were written to the reference; and not some strange amalgamate of your program states.
However, without synchronization, concurrent programs are not guaranteed to behave the same way as an equivalent sequential program.
For instance, the following program will reach the assert false
clause
let global = ref []
let sum = ref 0
let incr () =
let state = !global in
let a = Array.init 1_000 Fun.id in
let updated = a :: state in
global := updated
let decr () =
match !global with
| [] -> assert false
| _ :: q ->
global := q
let balanced () =
let incrs = List.init 100 (fun _ -> Thread.create incr ()) in
let () = List.iter Thread.join incrs in
let decrs = List.init 100 (fun _ -> Thread.create decr ()) in
List.iter Thread.join decrs
let () =
while true do balanced () done
even if all calls to incr
and decr
are well balanced. The reason is that
the read
and write
to the shared global
references in incr
and decr
are not guarantees to happen at the same time. Thus it is possible that two
calls to incr
are interleaved in this way:
(* first call to incr *) | (* Second call to incr *)
let state = !global in |
let a = Array.init 1_000 Fun.id in |
| let state = !global in
let updated = a :: state in |
global := updated | let a = Array.init 1_000 Fun.id in
| let updated = a :: state in
| global := updated
which means that the second call to incr
erases the update done by the first one, and after two calls to incr
we end up with only one new element in the global list.
Consequently, synchronization primitives are a necessity as soon as you may have concurrent writes to the same mutable shared resource.
Third, in OCaml 5 (aka with parallelism) references cannot be used as synchronization primitives. This is the difference between references and atomics. In particular, if you have
module M: sig
val update: unit -> unit
val read: unit -> int option
end = struct
let written = ref false
let answer = ref 0
let update () =
answer := 1;
written := true
let read () =
if !written then Some !answer else None
end
then on some CPU architecture it might happen than read ()
returns Some 0
because there is no guarantee than the write to answer
is seen before the write to written
.