Search code examples
haskellocamllambda-calculuscombinatory-logic

SystemT Compiler and dealing with Infinite Types in Haskell


I'm following this blog post: http://semantic-domain.blogspot.com/2012/12/total-functional-programming-in-partial.html

It shows a small OCaml compiler program for System T (a simple total functional language).

The entire pipeline takes OCaml syntax (via Camlp4 metaprogramming) transforms them to OCaml AST that is translated to SystemT Lambda Calculus (see: module Term) and then finally SystemT Combinator Calculus (see: module Goedel). The final step is also wrapped with OCaml metaprogramming Ast.expr type.

I'm attempting to translate it to Haskell without the use of Template Haskell.

For the SystemT Combinator form, I've written it as

{-# LANGUAGE GADTs #-}

data TNat = Zero | Succ TNat

data THom a b where
  Id :: THom a a
  Unit :: THom a ()
  ZeroH :: THom () TNat
  SuccH :: THom TNat TNat
  Compose :: THom a b -> THom b c -> THom a c
  Pair :: THom a b -> THom a c -> THom a (b, c)
  Fst :: THom (a, b) a
  Snd :: THom (a, b) b
  Curry :: THom (a, b) c -> THom a (b -> c)
  Eval :: THom ((a -> b), a) b -- (A = B) * A -> B
  Iter :: THom a b -> THom (a, b) b -> THom (a, TNat) b

Note that Compose is forward composition, which differs from (.).

During the translation of SystemT Lambda Calculus to SystemT Combinator Calculus, the Elaborate.synth function tries to convert SystemT Lambda calculus variables into series of composed projection expressions (related to De Brujin Indices). This is because combinator calculus doesn't have variables/variable names. This is done with the Elaborate.lookup which uses the Quote.find function.

The problem is that with my encoding of the combinator calculus as the GADT THom a b. Translating the Quote.find function:

  let rec find x  = function
    | [] -> raise Not_found
    | (x', t) :: ctx' when x = x' -> <:expr< Goedel.snd >> 
    | (x', t) :: ctx' -> <:expr< Goedel.compose Goedel.fst $find x ctx'$ >>

Into Haskell:

find :: TVar -> Context -> _
find tvar [] = error "Not Found"
find tvar ((tvar', ttype):ctx)
  | tvar == tvar' = Snd
  | otherwise     = Compose Fst (find tvar ctx)

Results in an infinite type error.

• Occurs check: cannot construct the infinite type: a ~ (a, c)
  Expected type: THom (a, c) c
    Actual type: THom ((a, c), c) c

The problem stems from the fact that using Compose and Fst and Snd from the THom a b GADT can result in infinite variations of the type signature.

The article doesn't have this problem because it appears to use the Ast.expr OCaml thing to wrap the underlying expressions.

I'm not sure how to resolve in Haskell. Should I be using an existentially quantified type like

data TExpr = forall a b. TExpr (THom a b)

Or some sort of type-level Fix to adapt the infinite type problem. However I'm unsure how this changes the find or lookup functions.


Solution

  • This answer will have to be a bit high-level, because there are three entirely different families of possible designs to fix that problem. What you’re doing seems closer to approach three, but the approaches are ordered by increasing complexity.

    The approach in the original post

    Translating the original post requires Template Haskell and partiality; find would return a Q.Exp representing some Hom a b, avoiding this problem just like the original post. Just like in the original post, a type error in the original code would be caught when typechecking the output of all the Template Haskell functions. So, type errors are still caught before runtime, but you will still need to write tests to find cases where your macros output ill-typed expressions. One can give stronger guarantees, at the cost of a significant increase in complexity.

    Dependent typing/GADTs in input and output

    If you want to diverge from the post, one alternative is to use “dependent typing” throughout and make the input dependently-typed. I use the term loosely to include both actually dependently-typed languages, actual Dependent Haskell (when it lands), and ways to fake dependent typing in Haskell today (via GADTs, singletons, and what not). However, you lose the ability to write your own typechecker and choose which type system to use; typically, you end up embedding a simply-typed lambda calculus, and can simulate polymorphism via polymorphic Haskell functions that can generate terms at a given type. This is easier in dependently-typed languages, but possible at all in Haskell.

    But honestly, in this road it’s easier to use higher-order abstract syntax and Haskell functions, with something like:

    data Exp a where
      Abs :: (Exp a -> Exp b) -> Exp (a -> b)
      App :: Exp (a -> b) -> Exp a -> Exp b
      Var :: String -> Exp a —- only use for free variables
    exampleId :: Exp (a -> a)
    exampleId = Abs (\x -> x)
    

    If you can use this approach (details omitted here), you get high assurance from GADTs with limited complexity. However, this approach is too inflexible for many scenarios, for instance because the typing contexts are only visible to the Haskell compiler and not in your types or terms.

    From untyped to typed terms

    A third option is go dependently-typed and to still make your program turn weakly-typed input to strongly typed output. In this case, your typechecker overall transforms terms of some type Expr into terms of a GADT TExp gamma t, Hom a b, or such. Since you don’t know statically what gamma and t (or a and b) are, you’ll indeed need some sort of existential.

    But a plain existential is too weak: to build bigger well-typed expression, you’ll need to check that the produced types allow it. For instance, to build a TExpr containing a Compose expression out of two smaller TExpr, you'll need to check (at runtime) that their types match. And with a plain existential, you can't. So you’ll need to have a representation of types also at the value level.

    What's more existentials are annoying to deal with (as usual), since you can’t ever expose the hidden type variables in your return type, or project those out (unlike dependent records/sigma-types). I am honestly not entirely sure this could eventually be made to work. Here is a possible partial sketch in Haskell, up to one case of find.

    data Type t where
      VNat :: Type Nat
      VString :: Type String
      VArrow :: Type a -> Type b -> Type (a -> b)
      VPair :: Type a -> Type b -> Type (a, b) 
      VUnit :: Type ()
    data SomeType = forall t. SomeType (Type t)
    data SomeHom = forall a b. SomeHom (Type a) (Type b) (THom a b)
    
    type Context = [(TVar, SomeType)] 
    getType :: Context -> SomeType
    getType [] = SomeType VUnit 
    getType ((_, SomeType ttyp) :: gamma) =  
       case getType gamma of
           SomeType ctxT -> SomeType (VPair ttyp
    find :: TVar -> Context -> SomeHom 
    find tvar ((tvar’, ttyp) :: gamma)
       | tvar == tvar’ =
           case (ttyp, getType gamma) of
             (SomeType t, SomeType ctxT) ->
              SomeHom (VPair t ctxT) t Fst