Search code examples
parsingbisonyacccontext-free-grammarlalr

Why a rule for each operation in Bison


While searching for Bison grammars, i found this example of C grammar:

https://www.lysator.liu.se/c/ANSI-C-grammar-y.html

logical_and_expression
    : inclusive_or_expression
    | logical_and_expression AND_OP inclusive_or_expression
    ;
logical_or_expression
    : logical_and_expression
    | logical_or_expression OR_OP logical_and_expression
    ;

I didn't understand the reason for a rule for each logical operation. Is there an advantage over this construction below?

binary_expression:
    : object // imagine it can be bool, int, real ...
    | binary_expression AND_OP binary_expression
    | binary_expression OR_OP binary_expression
    ;

Solution

  • The grammar you quote is unambiguous.

    The one you suggest is ambiguous, although yacc/bison allow you to use precedence rules to resolve the ambiguities.

    There are some advantages to using a grammar which makes operator precedence explicit:

    • It is a precise description of the language. Precedence rules are not part of the grammatical formalism and can be difficult to reason about. In particular, there is no general way to prove that they do what is desired.

    • The grammar is self-contained. Ambiguous grammars can only be understood with the addition of the precedence rules. This is particularly important for grammars used in language standards but it generally affects attempts to automatically build other syntax-based tools.

    • Explicit grammars are more general. Not all operator restrictions can be easily described with numeric precedence comparisons.

    • Precedence rules can hide errors in the grammar, by incorrectly resolving a shift-reduce conflict elsewhere in the grammar which happens to use some of the same tokens. Since the resolved conflicts are not reported, the grammar writer is not warned of the problem.

    On the other hand, precedence rules do have some advantages:

    • The precedence table compactly describes operator precedence, which is useful for quick reference.

    • The resulting grammar requires fewer unit productions, slightly increasing parse speed. (Usually not noticeable, but still...)

    • Some conflicts are much easier to resolve with precedence declarations, although understanding how the conflict is resolved may not be obvious. (The classic example is the dangling-else ambiguity.) Such cases have little or nothing to do with the intuitive understanding of operator precedence, so the use of precedence rules is a bit of a hack.

    The total size of the grammar is not really affected by using precedence rules. As mentioned, the precedence rules avoid the need for unit productions, but every unit production corresponds to one precedence declaration so the total number of lines is the same. There are fewer non-terminals, but non-terminals cost little; the major annoyance in yacc/bison is declaring all the semantic types, but that is easy to automate.