I'm currently writing an expression parser. I've done the lexical and syntactic analysis and now I'm checking the types. I have the expression in a data structire like this (simplified version):
data Expr = EBinaryOp String Expr Expr
| EInt Int
| EFloat Float
And now I need a function which would convert this to a new type, say TypedExpr
, which would also contain type information. And now my main problem is, how this type should look like. I have two ideas - with type parameter:
data TypedExpr t = TEBinaryOp (TBinaryOp a b t) (TExpr a) (TExpr b)
| TEConstant t
addTypes :: (ExprType t) => Expr -> TypedExpr t
or without:
data TypedExpr = TEBinaryOp Type BinaryOp TypedExpr TypedExpr
| TEConstant Type Dynamic
addTypes :: Expr -> TypedExpr
I started with the first option, but I ran into problems, because this approach assumes that you know type of the expression before parsing it (for me, it's true in most cases, but not always). However, I like it, because it lets me use Haskell's type system and check for most errors at compile time.
Is it possible to do it with the first option?
Which one would you choose? Why?
What problems should I expect with each option?
The type of your function
addTypes :: Expr -> TypedExpr t
is wrong, because it would mean that you get a TypedExpr t
for any t
you like. In contrast, what you actually want is one particular t
that is determined by the argument of type Expr
.
This reasoning already shows that you are going beyond the capabilities of the Hindley-Milner type system. After all, the return type of addTypes
should depend on the value of the argument, but in plain Haskell 2010, types may not depend on values. Hence, you need an extension of the type system that brings you closer to dependent types. In Haskell, generalized algebraic data types (GADTs) can do that.
For a first introduction to GADTs, see also my video on GADTs.
However, after becoming familiar with GADTs, you still have the problem of parsing an untyped expression into a typed one, i.e. to write a function
addTypes :: Expr -> (exists t. TypedExpr t)
Of course, you have to perform some type checking yourself, but even then, it is not easy to convince the Haskell compiler that your type checks (which happen on the value level) can be lifted to the type level. Fortunately, other people have already thought about it, see for example the following message in the haskell-cafe mailing list:
Edward Kmett.
Re: Manual Type-Checking to provide Read instances for GADTs. (was Re: [Haskell-cafe] Read instance for GATD)
http://article.gmane.org/gmane.comp.lang.haskell.cafe/76466
(Does anyone know of a formally published / nicely written up reference?)