I have an ADT representing the AST for a simple language:
data UTerm = UTrue
| UFalse
| UIf UTerm UTerm UTerm
| UZero
| USucc UTerm
| UIsZero UTerm
This data structure can represent invalid terms that don't follow the type
rules of the language, like UIsZero UFalse
, so I'd like to use a GADT that
enforces well-typedness:
{-# LANGUAGE GADTs #-}
data TTerm a where
TTrue :: TTerm Bool
TFalse :: TTerm Bool
TIf :: TTerm Bool -> TTerm a -> TTerm a -> TTerm a
TZero :: TTerm Int
TSucc :: TTerm Int -> TTerm Int
TIsZero :: TTerm Int -> TTerm Bool
My problem is to type check a UTerm and convert it into a TTerm. My first
thought was UTerm -> Maybe (TTerm a)
, but this of course doesn't work because
it's not valid for all a
s. I don't even know what the type would be, because
we don't know if a
is going to be Int or Bool. Then I thought I could write a
different type checking function for each of the possible values of a
:
import Control.Applicative
typecheckbool :: UTerm -> Maybe (TTerm Bool)
typecheckbool UTrue = Just TTrue
typecheckbool UFalse = Just TFalse
typecheckbool (UIsZero a) = TIsZero <$> typecheckint a
typecheckbool _ = Nothing
typecheckint :: UTerm -> Maybe (TTerm Int)
typecheckint UZero = Just TZero
typecheckint (USucc a) = TSucc <$> typecheckint a
typecheckint (UIf a b c) = TIf <$> typecheckbool a <*> typecheckint b <*> typecheckint c
typecheckint UTrue = Nothing
typecheckint UFalse = Nothing
typecheckint (UIsZero _) = Nothing
This works for some cases, for a subset of the language where TIf requires its
consequent and alternative are Ints (But TIf TTrue TFalse TTrue
is actually
totally valid), and where we know the target type of the expression we're
typing.
What's the right way to convert from a UTerm to a TTerm?
The standard technique is to define an existential type:
data ETerm_ where
ETerm_ :: TTerm a -> ETerm
In this case, you may also want some term-level evidence of which type you have; e.g.
data Type a where
TInt :: Type Int
TBool :: Type Bool
then the real ETerm
would look like this:
data ETerm where
ETerm :: Type a -> TTerm a -> ETerm
The interesting case of type checking is then something like
typeCheck (UIf ucond ut uf) = do
ETerm TBool tcond <- typeCheck ucond
ETerm tyt tt <- typeCheck ut
ETerm tyf tf <- typeCheck uf
case (tyt, tyf) of
(TBool, TBool) -> return (ETerm TBool (TIf tcond tt tf))
(TInt , TInt ) -> return (ETerm TInt (TIf tcond tt tf))
_ -> fail "branches have different types"