Search code examples
haskellhappy

syntax happy parser meaning


I have this grammar section in a happy parser, given on the Happy official site, but I need some deeper explanation of the meaning of the rules in brackets. Here is the token definition

%token 
      let             { TokenLet }
      in              { TokenIn }
      int             { TokenInt $$ }
      var             { TokenVar $$ }
      '='             { TokenEq }
      '+'             { TokenPlus }
      '-'             { TokenMinus }
      '*'             { TokenTimes }
      '/'             { TokenDiv }
      '('             { TokenOB }
      ')'             { TokenCB }

and here the grammar section

Exp   : let var '=' Exp in Exp  { Let $2 $4 $6 }
          | Exp1                    { Exp1 $1 }

Exp1  : Exp1 '+' Term           { Plus $1 $3 }
      | Exp1 '-' Term           { Minus $1 $3 }
      | Term                    { Term $1 }

Term  : Term '*' Factor         { Times $1 $3 }
      | Term '/' Factor         { Div $1 $3 }
      | Factor                  { Factor $1 }

Factor            
      : int                     { Int $1 }
      | var                     { Var $1 }
      | '(' Exp ')'             { Brack $2 }

What I understand is that the lexer, defined below in the file, should produce tokens only of the type definined and then build the parse tree using the grammar. But what exactley mean "{Let $2 $4 $6}"? I know that $2 refers to the second rule argument and so on but if someone can give me a "human read version" of the rules I would be really happy. Hope I've been clear.

Thanks in advance.


Solution

  • In the %token section the left column is the token names used elsewhere in the grammar, and the right is a pattern that can be used in a case statement. Where you see $$ Happy will substitute its own variable. So if the resulting parser is expecting an Integer at some point then Happy will have a case statement with a pattern including TokenInt v1234 where the v1234 bit is a variable name created by Happy.

    The "Let" is the constructor for the grammar expression being recognised. If you look a little lower in the example page you will see

    data Exp  
        = Let String Exp Exp
        | Exp1 Exp1
        deriving Show
    

    So the Let constructor takes a string and two sub-expressions (of type 'Exp'). If you look at the grammar you can see that there six elements in the let rule. The first is just the constant string "let". That is used by the generated parser to figure out that its looking at a "let" clause, but the resulting parse tree doesn't need it. So $1 doesn't appear. Instead the first argument to the Let constructor has to be the name of the variable being bound, which is the second item in the grammar rule. Hence this is $2. The other things are the two sub-expressions, which are $4 and $6 by the same logic. Both of these can be arbitrarily complex expressions: Happy figures out where they begin and end and parses them by following the other rules for what constitutes expressions.