Search code examples
parsingbisonnullablerulesreduce-reduce-conflict

Bison eliminate reduce/reduce conflict among nullable non-terminals?


I am using Bison (AFAIK they use LL(1) parsing as default).

My grammar says something like this:

function_decl: ID '(' params ')' ':' TYPE ... // body may go here
function_call: ID '(' arguments ')'

params: ID ':' TYPE
     | params ',' ID ':' TYPE
     | %empty

arguments: ID
    | arguments ',' ID
    | %empty

Now, bison warns saying a reduce/reduce conflict, because params and arguments both are null-able (in case of a zero-parameter function()).

My question is, how can I remove (instead of suppressing) this conflict ?

Although someone suggested to use different parsing technique, I want to make it clear for myself if it is possible (and I should do so) or I should just ignore it.


Solution

  • If you ignore the warning, you will end up with a parser which cannot recognise a function call without arguments. So that is probably not a good idea.

    You are quite correct that the conflict is the result of both params and arguments producing an empty string. Because the parser can only lookahead at one symbol a when it reads the ) in func(), it needs to decide whether to reduce an empty params (which will force it to proceed with function_decl) or an empty arguments (which will commit it to function_call). But there is no way to know until the next token is read.

    The easiest solution is to avoid the empty reductions, although it makes the grammar slightly wordier. In the following, neither params nor arguments produce the empty string, and extra productions for function_decl and function_call are used to recognise these cases.

    function_decl: ID '(' params ')' ':' TYPE ... // body may go here
    function_decl: ID '(' ')' ':' TYPE ... 
    function_call: ID '(' arguments ')'
    function_call: ID '(' ')'
    
    params: ID ':' TYPE
         | params ',' ID ':' TYPE
    
    arguments: ID
        | arguments ',' ID
    

    This works because it lets the parser defer the decision between call and declaration. An LR parser can defer decisions until it has to reduce; it can keep several productions open at the same time, but it must reduce a production when it reaches its end.


    Note that even without the conflict, your original grammar is incorrect. As written, it allows arguments (or params) to start with an arbitrary number of commas. Your intention was not to allow %empty as an alternative base case, but rather as an alternative complete expansion. The correct grammar for an optional comma-separated list requires two non-terminals:

    arguments
        : %empty
        | argument_list
    argument_list
        : argument
        | argument_list ',' argument