Search code examples
ocaml

How to implement add function for immutable doubly linked list?


This question has theoretical motivation only (of course, there are multiple ways to avoid it in real-life development).

Let's introduce an immutable doubly linked list as the following OCaml type.

type t =
  | Link of 
     { value : int; 
        next : t; 
        prev : t 
     }
  | End 

We can easily make an instance with three elements within recursive definitions of (non-functional) values that OCaml allows:

let _123 = 
  let rec a = Link { value = 1; next = b; prev = End }
  and     b = Link { value = 2; next = c; prev = a   }
  and     c = Link { value = 3; next = End; prev = b } in
  a

We can represent '_123' with the following image:

enter image description here

That's fine, but it's like nothing without the ability to create lists dynamically (with any amount of items). So, I declared the function:

let add x = function
  | End       -> Link { value = x; next = End; prev = End }
  | Link tail -> let rec head = Link { value = x; next = tail'; prev = End }
                 and tail' = Link { tail with prev = head } in
                 head

and tried to use it:

let _123 = 
  End |> add 3 
      |> add 2 
      |> add 1

the result isn't what I expected: enter image description here enter image description here

It contains extra items and corrupted links. I understand why it is so. It is impossible to create a new version of the list based on the "tail" because of producing extra items. It looks like there is only one way to do that - it is to recreate the list from scratch for every newly added item. Ok. But how to do that? How to implement an add function for the immutable doubly linked list?


Solution

  • Immutable doubly-linked list cannot be created incrementally. Consider the single-list element list:

    let singleton value = Link { value; next = End; prev = End }
    

    Since the prev element of this list is End, it cannot be the part of any list with more than one element. The same issue exists with list of length n: such list does not contain any sublist which is a list of length k < n.

    In other words to create a list with n elements, one need to create n link of the list simultaneously, which is not possible without mutation. In particular, mutually recursive definitions are implemented using a single mutation initialization under the hood.

    But this also means, that if we accept this constraint, we can have doubly-link persistent list by using observably-invisible mutations:

    type 'a l =
      | End
      | Link of { prev: 'a l; value:'a; next: unit -> 'a l }
    

    Here, replacing the type of next by unit -> 'a l gives us just enough breathing room to create the list step by step:

    let create l =
      (* precondition: when [create ref_to_prev q] is called [!ref_to_prev ()] will
         be a valid link *)
      let rec create ref_to_prev q () =
        match q with
        | [] -> End
        | value :: q ->
          (* dummy value *)
          let future_self = ref (fun () -> raise Non_initialized) in
          (* next cannot be called yet, but it has not been published yet,
             thus this is fine *)
          let next () = create future_self q in
          let self = Link {
            prev = (* precondition: this is now a valid value *)
              !ref_to_prev ();
            value;
            next;
          } in
          (* we backpatch the [prev] argument for [next], now that we have a [self] link *)
          future_self := (fun () -> self);
          (* [next] can now be called, and [self] is a valid link
             that we can publish to the outside world *)
          self
      in
      create (ref (Fun.const End)) l ()
    

    Here, create has the expected type 'a list -> 'a l. We can check that we have the right digraph structure with

    let rec backward_list = function
      | End -> []
      | Link l ->
        l.value :: backward_list l.prev
    
    
    let rec forth_and_back = function
      | End -> []
      | Link lk as l ->
        match lk.next () with
        | End -> backward_list l
        | l ->  lk.value :: forth_and_back l
    
    let test = forth_and_back (create [1;2;3;4])
    

    which returns [1;2;3;4;3;2;1] as expected.

    We have us a persistent doubly-linked list: no mutation can be observed from the outside but the elements of the list are in fact stored in the thunk:

    let next () = create future_self q in
    

    and we constructing the next element of the list on the fly whenever we call the next field. We could build the list once, by replacing

      next: unit -> 'a l;
    

    by

      next: 'a l Lazy.t;
    

    in the definition of the type constructor 'a l.