I'm writing an SQL interpreter. I need to distinguish between ill-formed expressions at compile time, and run-time errors.
I'll give you an example of something that should be well-formed, but possibly fails at run-time.
SELECT $ [ColumnName "first_name" `AS` "name"] `FROM` TABLE "people.csv" `WHERE` (ColumnName "age" `Gte` LiteralInt 40)
I'd like to focus in on the expression:
(ColumnName "age" `Gte` LiteralInt 40)
This should pass the type checker. But, say "age" did not contain something that could be expressed as a LiteralInt
.
So I would want Gte
to produce something like IO Bool
(not minding exception handling for now).
But I wouldn't always need Gte
to produce an IO Bool
. In the event I had something like this:
(LiteralInt 40 `Gte` LiteralInt 10)
I just need a Bool.
or something like this:
(LiteralInt 40 `Gte` LiteralBool True)
needs to fail at compile time.
So, I've been toying around with data families and GADTs, and have gone down many dead ends that would just serve to confuse the situation if I explicated them.
Does my problem make sense, and if so is there a path of inquiry I can traverse that will lead to a solution?
So I would want
Gte
to produce something likeIO Bool
(not minding exception handling for now).But I wouldn't always need
Gte
to produce anIO Bool
.
That is not possible (nor indeed desirable). Gte
will have to always return the same type. Also, you probably want to separate the construction of your query from its execution...
or something like this:
(LiteralInt 40 `Gte` LiteralBool True)
needs to fail at compile time.
Now that is much more reasonable. If you decide to go down that path, you can even customize the type errors GHC reports with the new TypeError
feature. However, sticking to a simple example involving just LiteralInt
, LiteralBool
and Gte
, you could do something like the following with just GADTs:
{-# LANGUAGE GADTs #-}
data Expr a where
LiteralInt :: Int -> Expr Int
LiteralBool :: Bool -> Expr Bool
Gte :: Expr Int -> Expr Int -> Expr Bool
Add :: Expr Int -> Expr Int -> Expr Int
ColumnName :: String -> Expr a
Then, the following would all compile:
ColumnName "age" `Gte` LiteralInt 40
LiteralInt 40 `Gte` LiteralInt 10
(LiteralInt 40 `Add` ColumnName "age") `Gte` LiteralInt 10
while the following would not:
LiteralInt 40 `Gte` LiteralBool True
LiteralInt 40 `Add` LiteralBool True
But, say "age" did not contain something that could be expressed as a
LiteralInt
.
You could potentially make this also compile time if you knew your schema at compile time and you wanted to do lots of type-hackery. The simpler solution would be just to make the error handling happen when you execute your query. So you would have a function looking something like
evalExpr :: Expr a -> ExceptT e IO a
And you would probably perform the appropriate checks concerning the types of columns here.