I've been looking at Haskell and I'd quite like to write a compiler in it (as a learning exercise), since a lot of its innate features can be readily applied to a compiler (particularly a recursive descent compiler).
What I can't quite get my head around is how to represent a language's grammar in a Haskell-ian way. My first thought was to use recursive data type definitions, but I can't see how I use them to match against keywords in the language ("if") for example.
Thoughts and suggestions greatly appreciated,
Pete
A recursive data type is fine for this. For example, given the language:
expr ::= var
| "true"
| "false"
| "if" expr "then" expr "else" expr
| "(" expr ")"
an example expression in this language would be:
if true then x else (if false then y else true)
Your Haskell data type would look something like this:
data Expr = Var String
| Lit Bool
| If Expr Expr Expr
Your parser then takes care to translate, e.g., x
into Var "x"
, and true
into Lit True
, etc. I.e.:
parse "if x then false else true"
== If (Var "x") (Lit False) (Lit True)
For writing parsers you can roll your own using the techniques mentioned in Norman's answer, or using Parsec or use parser generators like Happy.