Search code examples
ocamlgadtlocally-abstract-type

How do I Pattern-match GADT Types Get Alternative More Concrete Return Types?


I want to exploit GADT to implement the type ('a, 'b) liInstr_t in order to hold various types of instructions which are recursively decoded into basic operations (if need be) that then executed. (Eventually, I should construct more abstract types over them but in a mechanical, compositional and scripted fashion.) Unfortunately, I have difficulties associating the locally abstract types from pattern-matching function argument with the alternative concrete return types desired for the GADT.

I believe I'm missing something fundamental or making wrong assumptions though I have looked at the ocaml 4.10.0 manual on locally abstract types and gadts, the Real world Ocaml book, and the responses to similar questions such as here and here. This is because I seem to follow their explanations but cannot somehow apply them to my task.

From the above, I understand that polymorphic type variables annotate functions so that they can take on (be unified with) arbitrary types compatible with their constraints, and that locally abstract types let us have different types along alternative paths through a pattern-matching expression, say. Also, the local abstract types cannot be unified but can be refined to concrete types compatible with GADT result types. As such, GADTs can recurse over polymorphic types, and aggregate multiple result types into a single sum type.

I deliberately let the type ('a, 'b) liInstr_t have two type variables (so I can add more later), and its variants capture various constraint formats and scenarios I may have to use together.

type
  liLabel_t = string        (* Instruction name (label) *)
and
  context_t = string        (* TODO: execution context *)
and
  'a context_list_t = 'a list
and
  'a liChooser_t = 'a -> int    (* get index of i-th list entry *)
and
  ('a, 'b) liInstr_t =
    LiExec: 'a -> ('a, 'b) liInstr_t        (* executable operation *)
  | LiExecTRY: ('a, _) liInstr_t        (* Ignore: Experiment on GADT *)
  | LiLab: liLabel_t -> ('a, 'b) liInstr_t  (* instruction label *)
  | LiLabTRY: (liLabel_t, _) liInstr_t      (* Ignore: Experiment on GADT *)
  | LiSeq: 'a liChooser_t * 'b list -> ('a, 'b) liInstr_t   (* sequence *)
  | LiAlt: 'a liChooser_t * 'b list -> ('a, 'b) liInstr_t   (* choice *)
  | LiLoop: 'a liChooser_t * 'b list -> ('a, 'b) liInstr_t  (* loop *)
  | LiName: 'a liChooser_t * liLabel_t * 'b context_list_t ->
    ('a, 'b) liInstr_t              (* change context *)
  | Err_LiInstr: ('a, 'b) liInstr_t     (* error handling *)
  | Nil_LiInstr: ('a, 'b) liInstr_t     (* no action *)

After experimenting, the sample function used is:

let ft1:  type  b c. (b, c) liInstr_t -> b = function
(* *)  | LiExec n -> n
(* *)  | LiExecTRY -> "4"
(* *)  | LiLab s -> "LiLab" 
(* *)  | LiLabTRY -> "LiLabTRY"
(* *)  | LiSeq (f, il) -> "LiSeq" 
(* *)  | LiAlt (f, il) -> "LiAlt" 
(* *)  | LiLoop (f, il) -> "LiLoop"
(* *)  | LiName (f, il, ic) -> "LiName"
(* *)  | Err_LiInstr -> "Err_LiInstr"
(* *)  | Nil_LiInstr -> "Nil_LiInstr"
;;

and it gave the error:

Line 3, characters 22-25:
3 | (* *)  | LiExecTRY -> "4"
                          ^^^
Error: This expression has type string but an expression was expected of type
         b

I still got errors when I changed the function annotation (and typing), or commented out some alternatives in the function pattern matching and the GADT type variants. Some of the errors (elided for brevity) were obtained as follows:

Using an extra locally-typed variable:

let ft1 :  type b c d. (b, c) liInstr_t -> d = function ...
2 | (* *)  | LiExec n -> n
                         ^
Error: This expression has type b but an expression was expected of type d

Using only polymorphic type variables:

let ft1:  'b 'c. ('b, 'c) liInstr_t -> 'b = function ...

Error: This definition has type 'c. (liLabel_t, 'c) liInstr_t -> liLabel_t
       which is less general than 'b 'c. ('b, 'c) liInstr_t -> 'b

My questions then are the following:

  1. How can we capture and use the abstract types identified with alternative paths? A locally abstract type should bind (or be refined) to compatible concrete type(s) for values found in the resulting expression, or can be ignored, right? Ignoring recursion, this example:
let rec eval : type a. a term -> a = function
  | Int n    -> n                 (* a = int *)
  | Add      -> (fun x y -> x+y)  (* a = int -> int -> int *)
  | App(f,x) -> (eval f) (eval x)
          (* eval called at types (b->a) and b for fresh b *)

on expression evaluation in the ocaml manual seems to suggest that is the case, at least for a 1-parameter GADT type. So, why aren't my types b and c not suitably bound (or refined) in the return type? And if they are binding (or being refined), which should bind to abstract type b and which to c, if at all? How can I find values for my return type so they can correctly associate with the abstract, value-less types reaching them. For, there is seems no way to obtain a result that has the type b in my first error above!

  1. Why am I forced to have the same result type for the alternative paths to succeed (string type in my examples) whereas all possible result types of the GADT should be admissible. In this regard, the first variant (LiExec n -> n) seemed forced to have type string! Also, the abstract types and polymorphic variables along execution paths seem irrelevant to the result type!

I could not reproduce it but at one point, making the first variant LiExec n -> 4 seemed to require integer return values from all alternative pattern matches. If indeed this is the case, why should abstract types on alternative paths require values from the same non-GADT return type? (This behaviour is of non-polymorphic types, right?)

  1. To work around incomprehensible issues on polymorphic types and locally abstract types, is there a simple way to mix them in a constraint? Various permutations to mix them always seem to result in a syntax error. e.g.:
let ft1: (type d)  'b 'c. ('b, 'c) liInstr_t -> d = function
          ^^^^
Error: Syntax error

Solution

  • In brief, the type li_Instr_t as defined is not an interesting GADT and it can be rewritten to the strictly equivalent ADT

    type  ('a, 'b) liInstr_t =
      | LiExec of 'a
      | LiExecTRY
      | LiLab of liLabel_t
      | LiLabTRY
      | LiSeq of 'a liChooser_t * 'b list 
      | LiAlt of 'a liChooser_t * 'b list
      | LiLoop of 'a liChooser_t * 'b list
      | LiName of 'a liChooser_t * liLabel_t * 'b context_list_t
      | Err_LiInstr
      | Nil_LiInstr
    

    because the type declaration never introduces equations (or existential quantifications) between the result type and the constructor of the GADT.

    If we look at a simple example for GADT:

    type ('elt,'array) compact_array =
    | String: (char, String.t) compact_array
    | Float_array: (float, Float.Array.t) compact_array
    | Standard: ('a, 'a array) compact_array
    
    let init: type elt array.
      (elt,array) compact_array -> int -> (int -> elt) -> array = 
    fun kind n f -> match kind with
    | String -> String.init n f
    | Float_array -> Float.Array.init n f
    | Standard -> Array.init n f
    

    The difference is that the constructor String constrains the type of compact_array to be (char,string) compact_array. Thus, when I observe String in the pattern matching above, I can introduce the equation elt=char and array=string in the branch String and use those equation locally . Similarly, after observing the constructor Float_array in the pattern matching, I can work with the equation elt=float and array=Float.Array.t inside the corresponding branch.

    Contrarily, with the definition of liInstr_t as it stands, observing a constructor of a value of type ('a,'b) liInstr_t brings no information on the type ('a,'b) liInstr_t. Consequently, the function ft1 of type type a b. (a,b) liInstr_t -> b is is promising to return a float array when called with ft1 (LiExecTRY:('a,float array) li_Instr_t). More generally, a function of type a b. (a,b) liInstr_t -> awhere no constructor impose a constraint onbis necessarily returning some value of typebthat was contained inside(a,b) liInstr_t` (or is not returning).

    Using that knowledge, we can update your type liInstr_t to make the function ft1 works by adding the equations corresponding to the expected return type for ft1 to the definition of the type:

    type liLabel_t = string
    and context_t = string
    and 'a context_list_t = 'a list
    and 'a liChooser_t = 'a -> int
    and ('a, 'b, 'ft1) liInstr_t =
    | LiExec: 'a -> ('a, 'b,'a) liInstr_t    (* ft1 returns the argument of LiExec *)
     (* ft1 returns a string in all other cases *)
    | LiExecTRY: ('a, 'b, string) liInstr_t   
    | LiLab: liLabel_t -> ('a, 'b, string) liInstr_t 
    | LiLabTRY: (liLabel_t, 'b, string) liInstr_t
    | LiSeq: 'a liChooser_t * 'b list -> ('a,'b, string) liInstr_t
    | LiAlt: 'a liChooser_t * 'b list -> ('a,'b, string) liInstr_t
    | LiLoop: 'a liChooser_t * 'b list -> ('a,'b, string) liInstr_t
    | LiName: 'a liChooser_t * liLabel_t * 'b context_list_t ->
        ('a,'b, string) liInstr_t
    | Err_LiInstr: ('a, 'b, string) liInstr_t
    | Nil_LiInstr: ('a, 'b, string) liInstr_t
    

    and now that we have the right equation in place, we can define ft1 as:

    let ft1:  type  a b c. (a, b, c) liInstr_t -> c = function
    | LiExec n -> n
    | LiExecTRY -> "4"
    | LiLab s -> "LiLab" 
    | LiLabTRY -> "LiLabTRY"
    | LiSeq (f, il) -> "LiSeq" 
    | LiAlt (f, il) -> "LiAlt" 
    | LiLoop (f, il) -> "LiLoop"
    | LiName (f, il, ic) -> "LiName"
    | Err_LiInstr -> "Err_LiInstr"
    | Nil_LiInstr -> "Nil_LiInstr"
    

    which typechecks without any error.