Search code examples
haskellparser-generatorshift-reduce-conflicthappy

How rewrite grammar to eliminate shift-reduce conflict (in Haskell Happy parser)


I'm trying to define grammar for methods (java like) using Happy LALR parser generator

  1.  MD ::= some_prefix { list(VD) list(S) }
  2.  VD ::= T I
  3.  S ::= I = E | I [ E ] = E | etc...
  4.  T ::= I | byte | int | etc...
  5.  E ::= INT | E + E | etc...

Here,

  • MD: Method Declaration
  • VD: Variable Declaration
  • S: Statement
  • T: Type
  • I: Identifier
  • E: Expression

All the other tokens are terminals.

Within the method, variable declarations are done in the top and after that the statements.

As you can see VD can starts from an I since there can be variable declarations of type class where the type is an identifier (I). Statement can also be started from an I because of assignments to variables and variable name is an I
The problem is VD and S both can starts from an I. Therefore, in the first production it cause an shift/reduce conflict.

Is there a way to re-write grammar or any other parser generator tricks to solve this problem?

I have specified the associativity and precedence for the operators. I have only mentioned the minimum set of information to explain the problem. Let me know if you need more information.


UPDATE:

Following is the grammar file

{
module Issue where
import Lexer
}

%name parser
%tokentype { Token }
%error { parseError }

%token
--===========================================================================
  ';'         { TokenSemi }
  id          { TokenId $$ }
  '{'         { TokenLBrace }
  '}'         { TokenRBrace }
  public      { TokenPublickw }
  '('         { TokenLParen }
  ')'         { TokenRParen }
  int         { TokenInt $$ }
  plus        { TokenPlus }
  inttype     { TokenIntkw }
  '='         { TokenAssign }
--===========================================================================

--===========================================================================
-- Precedence and associativity. Reference:
-- http://introcs.cs.princeton.edu/java/11precedence/
%right '='
%left plus
--===========================================================================

%%

MethodDecl :
  public id '(' ')' '{' list(VarDecl) list(Statement) '}'
  { MethodDecl $2 (VarList $6) (BlockStatement $7) }

DataType :
  inttype
  { DataType TypeInt }
  | id
  { DataType (TypeClass $1) }

VarDecl :
    DataType id ';'
  { VarDecl $1 $2 }

Statement :
  id '=' Expression ';'
  { Assignment $1 $3 }

Expression :
  int
  { IntLiteral $1 }

  | Expression plus Expression
  { PlusExp $1 $3 }

--============================================================================

list1(p) :
    p
  { [$1] }
  | p list1(p)
  { $1 : $2 }


list(p) :
    list1(p)
  { $1 }
  | -- epsilon
  { [] }
--============================================================================

{
data AST =  Goal AST [AST]
          | BlockStatement [AST]
          | IntLiteral Int
          | PlusExp AST AST
          | MethodDecl String AST AST
          | DataType MJType
          | Identifier String
          | VarList [AST]
          | VarDecl AST String
          | Assignment String AST
          deriving Show

data MJType = TypeInt
      | TypeUnknown
      | TypeClass String
      deriving (Show,Eq)




parseError :: [Token] -> a
parseError (t:ts) = error ("Parser Error: " ++ (show t))
}

.info file generated by Happy parser with the details of shift-reduce conflicts and states

-----------------------------------------------------------------------------
Info file generated by Happy Version 1.19.4 from issue.y
-----------------------------------------------------------------------------

state 7 contains 1 shift/reduce conflicts.
state 9 contains 1 shift/reduce conflicts.

-----------------------------------------------------------------------------
Grammar
-----------------------------------------------------------------------------
    %start_parser -> MethodDecl                        (0)
    MethodDecl -> public id '(' ')' '{' list(VarDecl) list(Statement) '}'   (1)
    DataType -> inttype                                (2)
    DataType -> id                                     (3)
    VarDecl -> DataType id ';'                         (4)
    Statement -> id '=' Expression ';'                 (5)
    Expression -> int                                  (6)
    Expression -> Expression plus Expression           (7)
    list(Statement) -> list1(Statement)                (8)
    list(Statement) ->                                 (9)
    list(VarDecl) -> list1(VarDecl)                    (10)
    list(VarDecl) ->                                   (11)
    list1(Statement) -> Statement                      (12)
    list1(Statement) -> Statement list1(Statement)     (13)
    list1(VarDecl) -> VarDecl                          (14)
    list1(VarDecl) -> VarDecl list1(VarDecl)           (15)

-----------------------------------------------------------------------------
Terminals
-----------------------------------------------------------------------------
    ';'            { TokenSemi }
    id             { TokenId $$ }
    '{'            { TokenLBrace }
    '}'            { TokenRBrace }
    public         { TokenPublickw }
    '('            { TokenLParen }
    ')'            { TokenRParen }
    int            { TokenInt $$ }
    plus           { TokenPlus }
    inttype        { TokenIntkw }
    '='            { TokenAssign }

-----------------------------------------------------------------------------
Non-terminals
-----------------------------------------------------------------------------
    %start_parser   rule  0
    MethodDecl      rule  1
    DataType        rules 2, 3
    VarDecl         rule  4
    Statement       rule  5
    Expression      rules 6, 7
    list(Statement) rules 8, 9
    list(VarDecl)   rules 10, 11
    list1(Statement) rules 12, 13
    list1(VarDecl)  rules 14, 15

-----------------------------------------------------------------------------
States
-----------------------------------------------------------------------------
State 0


    public         shift, and enter state 2

    MethodDecl     goto state 3

State 1


    public         shift, and enter state 2


State 2

    MethodDecl -> public . id '(' ')' '{' list(VarDecl) list(Statement) '}'    (rule 1)

    id             shift, and enter state 4


State 3

    %start_parser -> MethodDecl .                       (rule 0)

    %eof           accept


State 4

    MethodDecl -> public id . '(' ')' '{' list(VarDecl) list(Statement) '}'    (rule 1)

    '('            shift, and enter state 5


State 5

    MethodDecl -> public id '(' . ')' '{' list(VarDecl) list(Statement) '}'    (rule 1)

    ')'            shift, and enter state 6


State 6

    MethodDecl -> public id '(' ')' . '{' list(VarDecl) list(Statement) '}'    (rule 1)

    '{'            shift, and enter state 7


State 7

    MethodDecl -> public id '(' ')' '{' . list(VarDecl) list(Statement) '}'    (rule 1)

    id             shift, and enter state 12
            (reduce using rule 11)

    '}'            reduce using rule 11
    inttype        shift, and enter state 13

    DataType       goto state 8
    VarDecl        goto state 9
    list(VarDecl)  goto state 10
    list1(VarDecl) goto state 11

State 8

    VarDecl -> DataType . id ';'                        (rule 4)

    id             shift, and enter state 19


State 9

    list1(VarDecl) -> VarDecl .                         (rule 14)
    list1(VarDecl) -> VarDecl . list1(VarDecl)          (rule 15)

    id             shift, and enter state 12
            (reduce using rule 14)

    '}'            reduce using rule 14
    inttype        shift, and enter state 13

    DataType       goto state 8
    VarDecl        goto state 9
    list1(VarDecl) goto state 18

State 10

    MethodDecl -> public id '(' ')' '{' list(VarDecl) . list(Statement) '}'    (rule 1)

    id             shift, and enter state 17
    '}'            reduce using rule 9

    Statement      goto state 14
    list(Statement)goto state 15
    list1(Statement)goto state 16

State 11

    list(VarDecl) -> list1(VarDecl) .                   (rule 10)

    id             reduce using rule 10
    '}'            reduce using rule 10


State 12

    DataType -> id .                                    (rule 3)

    id             reduce using rule 3


State 13

    DataType -> inttype .                               (rule 2)

    id             reduce using rule 2


State 14

    list1(Statement) -> Statement .                     (rule 12)
    list1(Statement) -> Statement . list1(Statement)    (rule 13)

    id             shift, and enter state 17
    '}'            reduce using rule 12

    Statement      goto state 14
    list1(Statement)goto state 23

State 15

    MethodDecl -> public id '(' ')' '{' list(VarDecl) list(Statement) . '}'    (rule 1)

    '}'            shift, and enter state 22


State 16

    list(Statement) -> list1(Statement) .               (rule 8)

    '}'            reduce using rule 8


State 17

    Statement -> id . '=' Expression ';'                (rule 5)

    '='            shift, and enter state 21


State 18

    list1(VarDecl) -> VarDecl list1(VarDecl) .          (rule 15)

    id             reduce using rule 15
    '}'            reduce using rule 15


State 19

    VarDecl -> DataType id . ';'                        (rule 4)

    ';'            shift, and enter state 20


State 20

    VarDecl -> DataType id ';' .                        (rule 4)

    id             reduce using rule 4
    '}'            reduce using rule 4
    inttype        reduce using rule 4


State 21

    Statement -> id '=' . Expression ';'                (rule 5)

    int            shift, and enter state 25

    Expression     goto state 24

State 22

    MethodDecl -> public id '(' ')' '{' list(VarDecl) list(Statement) '}' .    (rule 1)

    %eof           reduce using rule 1


State 23

    list1(Statement) -> Statement list1(Statement) .    (rule 13)

    '}'            reduce using rule 13


State 24

    Statement -> id '=' Expression . ';'                (rule 5)
    Expression -> Expression . plus Expression          (rule 7)

    ';'            shift, and enter state 26
    plus           shift, and enter state 27


State 25

    Expression -> int .                                 (rule 6)

    ';'            reduce using rule 6
    plus           reduce using rule 6


State 26

    Statement -> id '=' Expression ';' .                (rule 5)

    id             reduce using rule 5
    '}'            reduce using rule 5


State 27

    Expression -> Expression plus . Expression          (rule 7)

    int            shift, and enter state 25

    Expression     goto state 28

State 28

    Expression -> Expression . plus Expression          (rule 7)
    Expression -> Expression plus Expression .          (rule 7)

    ';'            reduce using rule 7
    plus           reduce using rule 7


-----------------------------------------------------------------------------
Grammar Totals
-----------------------------------------------------------------------------
Number of rules: 16
Number of terminals: 11
Number of non-terminals: 10
Number of states: 29

Solution

  • I found the solution. Instead of using 2 different lists

    list(VarDecl) list(Statement)
    

    use one list

    ordered_lists(VarDecl,Statements)
    

    Following is the definition of the ordered_lists.

    --A list of p followed by a list of q, where each list can be an empty list
    ordered_lists(p,q) :
        ordered_lists1(p,q)
      { $1 }
      | -- epsilon
      { ([], []) }
    
    --This list should contain atleast one of p or q. A list of p followed by a
    --list of q, where at most one list can be an empty list
    ordered_lists1(p,q) :
        p
      { ([$1], [])  }
      | q
      { ([], [$1])  }
      | p ordered_lists1(p,q)
      { ($1:fst($2), snd($2))  }
      | q list1(q)
      { ([], $1 : $2) }
    

    Definition of the list1 is available in the question description. Please don't hesitate to post a better answer.