Search code examples
schemegrammarlalrshift-reduce-conflictbigloo

How to avoid a shift reduce conflict in a LALR grammar for parsing nested lists?


I would like to create a LALR grammar to parse nested lists, but I get always a shift/reduce conflict.

I have the list1 which is a list of type1 items and list2:

<list1> ::= <type1> | <type1> <list1> ;
<type1> ::= A | B | <list2> ;

And I have a list2 which is a list of type2 items:

<list2> ::= <type2> | <type2> <list2> ;
<type2> ::= X | Y ;

This grammar produces a shift/reduce error. How can I avoid it?

This is the concrete Bigloo source:

(lalr-grammar 
  (comment
   new-line 
   text-chunk
   white-space)
  (input
   (()                '())
   ((node-list)       node-list))
  (node-list
   ((node)            (cons node '()))
   ((node node-list)  (cons node node-list)))
  (node
   ((comment)         (cons 'comment comment))
   ((new-line)        (cons 'new-line new-line))
   ((text)            (cons 'text text))
   ((white-space)     (cons 'white-space white-space)))
  (text
   ((text-chunk)      (cons 'text text-chunk))
   ((text-chunk text) (cons 'text (string-append text-chunk text)))))

The terminals are: comment, new-line, text-chunk and white-space. The non terminals are: input, node-list, node and text.

Bigloo complains about the reduce rule for text to text-chunk:

*** WARNING:bigloo:lalr-grammar
** Shift/Reduce conflict: 
 - shift to state 2
 - reduce rule (text --> text-chunk)
on token `text-chunk'

But I do not think that this is a Bigloo problem. It looks like a grammar problem.


Solution

  • The grammar is in fact ambiguous, because a sequence of type2 instances can be accumulated on list2 level, but it can also be seen as a sequence of type1 instances.

    E.g. this input

     X X
    

    can be parsed as

     list1(
       type1(
         list2(
           type2(
             X)
           list2(
             type2(
               X)))))
    

    but also as

     list1(
       type1(
         list2(
           type2(
             X)))
       list1(
         type1(
           list2(
             type2(
               X)))))
    

    What about introducing a delimiter on list1 level? This would solve the problem.