Search code examples
functional-programmingocaml

Flattening a Nested List in OCaml


I'm doing problem 7 of the OCaml exercises. It basically asks given a nested list as defined:

type 'a node =
  | One of 'a 
  | Many of 'a node list

Write a function function flatten that flattens it:

assert (
  flatten [ One "a"; Many [ One "b"; Many [ One "c"; One "d" ]; One "e" ] ]
  = [ "a"; "b"; "c"; "d"; "e" ])

The solution provided by the website is as follows:

let flatten list =
  let rec aux acc = function
    | [] -> acc
    | One x :: t -> aux (x :: acc) t
    | Many l :: t -> aux (aux acc l) t
  in
  List.rev (aux [] list)
;;

I'm wondering why they do x :: acc and then reverse the list instead of acc @ [x] which would avoid the need to reverse the list at the end?


Solution

  • How does @ work? Well, let's define it.

    let rec (@) lst1 lst2 =
      match lst1 with 
      | [] -> lst2
      | x::xs -> x :: (xs @ lst2)
    

    So, in order to append one list to another, we have to iterate over all of the first list. If we do this within a loop, as that first list gets longer, it takes longer to get to the end of it. In big-O notation, this has O(n^2) or "quadratic" performance. Fine for (very) small sample sizes, but prohibitive for large sample sizes.

    However, :: occurs in constant time. It doesn't matter how big the list is, it will always take the same amount of time to cons a value onto the front of it. If we do that in a loop, we get O(n) or "linear" performance.

    Reversing a list also works in linear time. Is it costly to run two linear time operations? Certainly more so that running one, but consider two times n vs. n squared:

    n 2n n^2
    0 0 0
    1 2 1
    2 4 4
    3 6 8
    4 8 16
    100 200 10,000
    10,000 20,000 100,000,000
    1,000,000 2,000,000 1,000,000,000,000

    An additional note: the implementation of flatten shown by the OP which utilizes :: is not tail-recursive. The following would be:

    let flatten lst =
      let rec aux lst to_process acc =
        match lst, to_process with
        | [], [] -> acc
        | [], _ -> aux to_process [] acc
        | (One v)::xs, _ -> aux xs to_process (v :: acc)
        | (Many lst')::xs, _ -> aux lst' (xs @ to_process) acc
      in
      List.rev @@ aux lst [] [] 
    

    Following the evaluation of your sample data:

    flatten [One "a"; Many [One "b"; Many [One "c"; One "d"]; One "e"]]
    aux [One "a"; Many [One "b"; Many [One "c"; One "d"]; One "e"]] []        []
    aux [Many [One "b"; Many [One "c"; One "d"]; One "e"]]          []        ["a"]
    aux [One "b"; Many [ One "c"; One "d" ]; One "e" ]              []        ["a"]
    aux [Many [ One "c"; One "d"]; One "e"]                         []        ["b"; "a"]
    aux [One "c"; One "d"]                                          [One "e"] ["b"; "a"]
    aux [One "d"]                                                   [One "e"] ["c"; "b"; "a"]
    aux [One "e"]                                                   []        ["d"; "c"; "b"; "a"]
    aux []                                                          []        ["e"; "d"; "c"; "b"; "a"]
    List.rev ["e"; "d"; "c"; "b"; "a"]
    ["a"; "b"; "c"; "d"; "e"]