When you write a Happy description, you have to define all possible types of token that can appear. But you can only match against token types, not individual token values...
This is kind of problematic. Consider, for example, the `data` keyword. According to the Haskell Report, this token is a "reservedid". So my tokeniser recognises it and marks it as such. However, consider the `as` keyword. Now it turns out that this is not a reservedid; it's an ordinary varid. It's only special in one context. You can totally declare a normal variable named `as`, and it's fine.
So here's a question: How do I parse `as` specifically?
Initially I didn't really think about it. I just defined a new token type which represents any varid token whose text happens to be `as`.
...and then I spent about 2 hours trying to work out why the hell my grammar didn't actually work. Yeah, it turns out that since this token type overlaps with an existing token type, the declaration order is significant. (!!!) Literally, just changing the order of the declarations made the grammar parse perfectly.
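To make the ordering issue concrete, here is a plain-Haskell sketch. It assumes (as far as I understand the generated code) that Happy turns the `%token` patterns into the alternatives of a single `case` expression, tried top to bottom. The `Token` and `Terminal` types and both function names are made up for illustration; they are not from any real grammar.

```haskell
-- Hypothetical lexer output: "as" arrives as an ordinary varid.
data Token
  = TokVarId String
  | TokData
  deriving (Eq, Show)

-- Hypothetical names for the terminals the grammar refers to.
data Terminal
  = TermAs
  | TermVarId
  | TermData
  deriving (Eq, Show)

-- Specific pattern declared first: "as" gets its own terminal,
-- every other varid falls through to the general one.
classifyAsFirst :: Token -> Terminal
classifyAsFirst tok = case tok of
  TokVarId "as" -> TermAs
  TokVarId _    -> TermVarId
  TokData       -> TermData

-- General pattern declared first: the "as" alternative can never
-- be reached (GHC warns about the overlapping pattern).
classifyVarIdFirst :: Token -> Terminal
classifyVarIdFirst tok = case tok of
  TokVarId _    -> TermVarId
  TokVarId "as" -> TermAs
  TokData       -> TermData
```

Loaded into GHCi, `classifyAsFirst (TokVarId "as")` picks the dedicated terminal, while `classifyVarIdFirst (TokVarId "as")` treats it as a plain varid, which is the same flip that reordering the declarations produces.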
But now I'm worried. I fear that `as` will never be matched as a varid and will only ever match as itself. So all the grammar rules that say varid will reject the `as` token, which is completely wrong!
What is the correct way to fix this?
What GHC does in its `Parser.y` is define a nonterminal `special_id` that lists many of the special non-keywords like `as`, and then define the `tyvarid` and `varid` nonterminals to include it as an alternative alongside the terminal `VARID` (and some others, although most of them look to me like they should have been put in `special_id` too).
An excerpt:

    varid :: { Located RdrName }
            : VARID                 { sL1 $1 $! mkUnqual varName (getVARID $1) }
            | special_id            { sL1 $1 $! mkUnqual varName (unLoc $1) }
            | 'unsafe'              { sL1 $1 $! mkUnqual varName (fsLit "unsafe") }
            ...
    special_id :: { Located FastString }
    special_id
            : 'as'                  { sL1 $1 (fsLit "as") }
            | 'qualified'           { sL1 $1 (fsLit "qualified") }
            | 'hiding'              { sL1 $1 (fsLit "hiding") }
            | 'export'              { sL1 $1 (fsLit "export") }
            ...
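The shape of that fix can be sketched in plain Haskell, outside of Happy. This is only an analogue of the two nonterminals, not GHC's code: the `Token` type and the functions `specialId` and `varId` are hypothetical, and they assume a lexer that emits a dedicated token for each special word.

```haskell
-- Hypothetical token type: the special words get their own tokens,
-- so the grammar can match them individually where they are special.
data Token
  = TokAs
  | TokQualified
  | TokHiding
  | TokVarId String   -- any other lowercase identifier
  | TokConId String
  deriving (Eq, Show)

-- Counterpart of the special_id nonterminal: accepts only the
-- special words and returns their text.
specialId :: Token -> Maybe String
specialId TokAs        = Just "as"
specialId TokQualified = Just "qualified"
specialId TokHiding    = Just "hiding"
specialId _            = Nothing

-- Counterpart of the varid nonterminal: accepts an ordinary varid
-- token, or anything specialId accepts.
varId :: Token -> Maybe String
varId (TokVarId name) = Just name
varId tok             = specialId tok
```

Rules that need the word itself match the dedicated token, and every rule that wants a variable name goes through `varId`, so `varId TokAs` still yields the name `as` instead of being rejected.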