Search code examples
ocamlgadt

What does this type error on heterogeneous lists means?


I have a heterogeneous list and a function on them

type ('ls,'tl) hlist =
  | Nil : ('a,'a) hlist
  | Cons: 'a * ('l, 't) hlist -> ('a -> 'l, 't) hlist

let rec headlist l = 
  match l with
  | Cons (h, t) -> Cons (head h, headlist t)
  | Nil -> Nil

and would like to traverse a hlist of lists of different types, and build a list of the heads of each list. The idea is something like this:

headlist Cons( [1,2,3], Cons( ['a','b'], Cons( [true,false], Nil ))) 
= Cons( 1, Cons( 'a', Cons( true, Nil)))

However, I encounter a type error.

Error: This expression has type ('a, 'b list -> 'a) hlist
       but an expression was expected of type
         ('b list -> 'a, 'b list -> 'a) hlist
       The type variable 'a occurs inside 'b list -> 'a

I don't understand the type error. What is it saying? Am I trying to do something impossible?


Solution

  • Your problem start with the fact that it is not possible to write a type for the headlist function that you have in mind. Since it is in general necessary to write explicitly the type of functions manipulating GATDs, it is good practice to start writing this type, to check that one can write the type; and only remove it afterward in the rare cases where it is possible to elide the explicit type annotations.

    The root of the issue here is that heterogeneous lists are much more rigid than normal lists. In particular, depending on the operations needed on such lists, it is frequent to have to tailor specialized heterogeneous list types. For instance, with the classical heterogeneous list:

    type void = |
    module Hlist = struct
      type 'a t =
        | []: void t
        | (::): 'a * 'l t -> ('a -> 'l) t
    
      let head(x::_) = x
      let rec length: type a. a t -> int = function
        | [] -> 0
        | a :: q -> 1 + length q 
    end
    

    it is impossible to express the condition: all elements of the heterogeneous list are heterogeneous lists with at least one element themselves. However, it is possible to define another list type that does enforce this condition:

    module Hlist_of_nonempty_hlist_0 = struct
      type 'a t =
        | []: void t
        | (::): (('h -> 'more) as 'a) Hlist.t * 'l t -> ('a -> 'l) t
    end
    

    With this new list type, I can compute the length of all nested lists:

    let rec map_length: type a. a Hlist_of_nonempty_hlist_0 t -> int list = function
     | [] -> []
     | a :: q -> Hlist.length a :: map_length q 
    

    However, I can still not apply head to all elements, because the types of the head are not easily accessible. One option is to store those types directly in the type of Hlist_of_nonempty_hlist:

    module Hlist_of_nonempty_hlist = struct
      type ('a,'b) t =
        | []: (void,void) t
        | (::):
           (('h -> 'more) as 'a) Hlist.t * ('l,'hl) t
           -> ('a -> 'l, 'h -> 'hl) t
    end
    

    and with this specialized heterogeneous list type, writing the type of map_head becomes straightforward:

    let rec map_head:
      type l hl. (l, hl) Hlist_of_nonempty_hlist.t -> hl Hlist.t
      = function
      | [] -> []
      | (a::_) :: q -> a :: map_head q
    

    But this is a lot of design work on the type for one function. And going further and trying to write any generic functions over heterogeneous lists generally require a lot of polymorphic records and functors.