Search code examples
parsinglr-grammarlemon

Generate the LR parse table from the Lemon Parser Generator


I'm trying to generate the parser table using the lemon parser generator, but the .out file generated when I run lemon grammar.y only contains the states of the automaton.

Is there a way to also get the goto table for non-terminals, not only the states of the automaton? Or this can only be done by reading the generated code? Are there any other tools that can generate both the action and the goto tables?

PS:

The .out file (generated by lemon) for a simple grammar looks like this:

State 0:
          start ::= * e
          e ::= * e PLUS t
          e ::= * t
          t ::= * t MUL f
          t ::= * f
          f ::= * LPAR e RPAR
          f ::= * ID

                          LPAR shift        1      
                            ID shift        4      
                         start accept
                             e shift        11     
                             t shift        6      
                             f shift        5      

State 1:
          e ::= * e PLUS t
          e ::= * t
          t ::= * t MUL f
          t ::= * f
          f ::= * LPAR e RPAR
          f ::= LPAR * e RPAR
          f ::= * ID

                          LPAR shift        1      
                            ID shift        4      
                             e shift        10     
                             t shift        6      
                             f shift        5      

State 2:
          e ::= e PLUS * t
          t ::= * t MUL f
          t ::= * f
          f ::= * LPAR e RPAR
          f ::= * ID

                          LPAR shift        1      
                            ID shift        4      
                             t shift        9      
                             f shift        5      

State 3:
          t ::= t MUL * f
          f ::= * LPAR e RPAR
          f ::= * ID

                          LPAR shift        1      
                            ID shift        4      
                             f shift        8      

State 4:
      (6) f ::= ID *

                             $ reduce       6      f ::= ID
                          PLUS reduce       6      f ::= ID
                           MUL reduce       6      f ::= ID
                          RPAR reduce       6      f ::= ID

State 5:
      (4) t ::= f *

                             $ reduce       4      t ::= f
                          PLUS reduce       4      t ::= f
                           MUL reduce       4      t ::= f
                          RPAR reduce       4      t ::= f

State 6:
      (2) e ::= t *
          t ::= t * MUL f

                             $ reduce       2      e ::= t
                          PLUS reduce       2      e ::= t
                           MUL shift        3      
                          RPAR reduce       2      e ::= t

State 7:
      (5) f ::= LPAR e RPAR *

                             $ reduce       5      f ::= LPAR e RPAR
                          PLUS reduce       5      f ::= LPAR e RPAR
                           MUL reduce       5      f ::= LPAR e RPAR
                          RPAR reduce       5      f ::= LPAR e RPAR

State 8:
      (3) t ::= t MUL f *

                             $ reduce       3      t ::= t MUL f
                          PLUS reduce       3      t ::= t MUL f
                           MUL reduce       3      t ::= t MUL f
                          RPAR reduce       3      t ::= t MUL f

State 9:
      (1) e ::= e PLUS t *
          t ::= t * MUL f

                             $ reduce       1      e ::= e PLUS t
                          PLUS reduce       1      e ::= e PLUS t
                           MUL shift        3      
                          RPAR reduce       1      e ::= e PLUS t

State 10:
          e ::= e * PLUS t
          f ::= LPAR e * RPAR

                          PLUS shift        2      
                          RPAR shift        7      

State 11:
      (0) start ::= e *
          e ::= e * PLUS t

                             $ reduce       0      start ::= e
                          PLUS shift        2      

----------------------------------------------------
Symbols:
    0: $:
    1: PLUS
    2: MUL
    3: LPAR
    4: RPAR
    5: ID
    6: error:
    7: start: LPAR ID
    8: e: LPAR ID
    9: t: LPAR ID
   10: f: LPAR ID

Solution

  • Lemon outputs the action table and the goto table in a single block. The goto function looks like shift actions, except that the lookahead is a non-terminal rather than a terminal.

    So if we take State 0:

     LPAR shift        1      
       ID shift        4      
    start accept
        e shift        11     
        t shift        6      
        f shift        5      
    

    The first two lines are the actions on reading LPAR and ID, respectively. The remaining lines are the goto function, which is used when a reduce action reveals this state by popping the stack. (Unlike a traditional LR machine, in Lemon the accept action is in the goto table rather than in the action table for the end-of-input pseudo-terminal.)

    Although most descriptions of the LR parser distinguish between the action table and the goto table, there is very little difference between a "shift" action and the "goto" part of a reduce action. Both of these push the current state number and a symbol onto the parser stack. The difference is that a reduce action (such as reduce 6, which means "reduce using production 6" -- it has nothing to do with state 6) first pops the right-hand side of the indicated production off of the stack and sets the current state to the newly-revealed state on the top of the stack before executing the shift/goto. (Another difference is that after a shift action, it is necessary to read a new lookahead token, whereas the reduce action does not consume the input.)