
Additional syntax error message on GLR parser when syntax is ambiguous


I am using Bison 2.7 to write a GLR parser and have also turned on the %error-verbose option. When I ran the parser, it gave me a "syntax is ambiguous" error. Is there a way for Bison to give me more details on where/how the syntax is ambiguous?


Solution

  • If you are looking to produce meaningful error messages, you will probably need to craft your own function to report ambiguities. Bison does give you a basic tool for this: the custom %merge function.

    As indicated in the Bison manual (see the end of the section on using GLR to resolve ambiguities), you can specify a custom merge function using %merge clauses on productions which can lead to ambiguities. A merge function can do pretty well anything, including signaling errors, but there are some limitations:

    1. The arguments to the merge function are semantic values (YYSTYPE), which will be of the subtype of the ambiguous production. The semantic actions of the ambiguous right-hand sides will have been executed before the merge function is called, so if the semantic actions mutate global parser state, that state might be inconsistent and the merge function will have to clean up. (Consequently, it is recommended that semantic actions in such a grammar not modify global state.) The merge function must return a semantic value of the appropriate type.

    2. If possibly ambiguous productions have different types, you will probably need to create a different merge function for each type, since there is no way to know which element of a semantic union is appropriate for a specific union value unless you code that into the YYSTYPE itself. A simple way of doing that is to create a "discriminated union" by including an enumerated tag as the first element of every union member. (C/C++ allow you to access such a tag using any union member, but it must be at the beginning for this to work.)

    3. The merge function is called with exactly two semantic values. So if there are more than two possible parses, it will be called multiple times.

    4. The merge function does not have access to location information, unless that information is included in the semantic type (which rather defeats the point of having a separate location stack IMHO, but of course you only need to include the location information in types which might participate in an ambiguity.)
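The four points above can be put together in a rough C sketch. Everything named here is hypothetical and only for illustration: `union sem_val` stands in for YYSTYPE, and the node kinds, `struct src_loc`, and `report_ambiguity` are invented for this example rather than taken from Bison. It assumes a discriminated union whose members all start with the same tag and location, and a merge function that reports the ambiguity and returns a placeholder value counting the competing parses.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical node kinds; a real grammar would define its own. */
enum node_kind { NODE_EXPR, NODE_DECL, NODE_AMBIGUOUS };

/* Location stored inside the semantic value, because the merge
   function is not given access to the location stack. */
struct src_loc { int first_line, first_column; };

/* Every member starts with the same (kind, loc) prefix, so the tag
   can be read through any member of the union. */
struct head_val  { enum node_kind kind; struct src_loc loc; };
struct expr_val  { enum node_kind kind; struct src_loc loc; int value; };
struct decl_val  { enum node_kind kind; struct src_loc loc; const char *name; };
struct ambig_val { enum node_kind kind; struct src_loc loc; int alternatives; };

/* Stand-in for YYSTYPE in this sketch. */
union sem_val {
    struct head_val  head;
    struct expr_val  expr;
    struct decl_val  decl;
    struct ambig_val ambig;
};

const char *kind_name(enum node_kind k)
{
    switch (k) {
    case NODE_EXPR: return "expression";
    case NODE_DECL: return "declaration";
    default:        return "ambiguous construct";
    }
}

/* Number of parses a value already stands for. */
int parse_count(union sem_val v)
{
    return v.head.kind == NODE_AMBIGUOUS ? v.ambig.alternatives : 1;
}

/* Merge function: takes exactly two values and returns one. With more
   than two parses it is called again on an earlier result, so the
   count is accumulated instead of being fixed at 2. */
union sem_val report_ambiguity(union sem_val x0, union sem_val x1)
{
    union sem_val result;

    /* Sanity check: this sketch assumes competing parses of the same
       input share a start position. */
    assert(x0.head.loc.first_line == x1.head.loc.first_line);

    fprintf(stderr, "%d:%d: syntax is ambiguous: %s or %s\n",
            x0.head.loc.first_line, x0.head.loc.first_column,
            kind_name(x0.head.kind), kind_name(x1.head.kind));

    result.ambig.kind = NODE_AMBIGUOUS;
    result.ambig.loc = x0.head.loc;
    result.ambig.alternatives = parse_count(x0) + parse_count(x1);
    return result;
}
```

In a grammar, a function like this would be attached by putting `%merge <report_ambiguity>` on each of the competing productions. Returning a placeholder value instead of aborting is just one design choice; it lets the parse continue so that several ambiguities can be reported in one run.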