parsing compiler-construction grammar

Parser errors - pattern for generating error handling automatically


Is there any known way to implement good error handling for machine generated parsers? Does a "pattern" or a known algorithm exist for this kind of problem?

By "good" I mean something that resembles the results obtainable with hand-crafted recursive descent parsers and modern compilers: the parser does not stop at the first error, and it can be made to emit "meaningful" errors rather than just "unrecognized token in line xyz", one error at a time.

Ideally this approach should be automated as well, not handcrafted.

I am not searching for a library. I need an approach that can be used on different platforms and is ideally as language independent as possible.


Solution

  • With a traditional YACC/bison generator you get the yyerror/YYERROR framework, with which it is hard to generate useful error messages: the table-driven LALR parser only detects that the current token is not allowed in the current state, and has little information about which rule was intended. You can add error recovery rules there (productions using the error token), and you may need them to suppress misleading error messages from failed rules where you only wanted to abandon the rule.

    With a PEG-based parser you get the much better ~{} postfix error action block syntax to work with. See e.g. the peg manual.

      rule = e1 e2 e3 ~{ error("e[12] ok; e3 has failed"); }
             | ...
    
      rule = (e1 e2 e3) ~{ error("one of e[123] has failed"); }
             | ...
    

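    You get excellent error messages at the actual place of the error. But you have to write PEG rules, which are not so easy to write, especially when handling operator precedence. That part is easier with an LALR parser generator, where precedence can be declared separately (e.g. %left and %right in yacc/bison).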

    With a simpler recursive descent parser generator you get the same error reporting advantages as with PEG, but with a much slower parse speed.

    See the same discussion at http://lambda-the-ultimate.org/node/4781
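
To make the idea concrete, here is a minimal hand-written sketch in Python of what such generated code could look like. It is not taken from peg or any other generator: the toy grammar (assignments ending in ';'), the class and method names, and the choice of ';' as the synchronization token are all assumptions made for illustration. Every expectation carries its own message, similar in spirit to the error action blocks above, and after an error the parser skips to the next ';' (panic-mode recovery) so that it can report more than one error per run.

```python
import re
from dataclasses import dataclass

@dataclass
class Token:
    kind: str   # "num", "id", "eof", or the punctuation character itself
    text: str
    pos: int    # 0-based column in the source string

def tokenize(src):
    tokens = []
    for m in re.finditer(r"(?P<num>\d+)|(?P<id>[A-Za-z_]\w*)|(?P<punct>\S)", src):
        kind = m.group() if m.lastgroup == "punct" else m.lastgroup
        tokens.append(Token(kind, m.group(), m.start()))
    tokens.append(Token("eof", "", len(src)))
    return tokens

class ParseError(Exception):
    """Raised to unwind to the nearest recovery point."""

class Parser:
    # Hypothetical toy grammar:
    #   program   = statement*
    #   statement = id '=' expr ';'
    #   expr      = term ('+' term)*
    #   term      = factor ('*' factor)*
    #   factor    = num | id | '(' expr ')'
    def __init__(self, src):
        self.tokens = tokenize(src)
        self.i = 0
        self.errors = []

    @property
    def tok(self):
        return self.tokens[self.i]

    def advance(self):
        tok = self.tok
        if tok.kind != "eof":       # never move past the eof token
            self.i += 1
        return tok

    def fail(self, expected, context):
        found = "end of input" if self.tok.kind == "eof" else repr(self.tok.text)
        self.errors.append(
            f"col {self.tok.pos}: expected {expected} {context}, found {found}")
        raise ParseError

    def expect(self, kind, expected, context):
        if self.tok.kind != kind:
            self.fail(expected, context)
        return self.advance()

    def synchronize(self):
        # Panic mode: drop tokens up to and including the next ';'.
        while self.tok.kind not in (";", "eof"):
            self.advance()
        if self.tok.kind == ";":
            self.advance()

    def parse_program(self):
        statements = []
        while self.tok.kind != "eof":
            try:
                statements.append(self.parse_statement())
            except ParseError:
                self.synchronize()  # keep going, so later errors are reported too
        return statements

    def parse_statement(self):
        name = self.expect("id", "a variable name", "at the start of a statement")
        self.expect("=", "'='", f"after variable {name.text!r}")
        value = self.parse_expr()
        self.expect(";", "';'", "to end the statement")
        return ("assign", name.text, value)

    def parse_expr(self):
        left = self.parse_term()
        while self.tok.kind == "+":
            self.advance()
            left = ("+", left, self.parse_term())
        return left

    def parse_term(self):
        left = self.parse_factor()
        while self.tok.kind == "*":
            self.advance()
            left = ("*", left, self.parse_factor())
        return left

    def parse_factor(self):
        if self.tok.kind == "num":
            return ("num", int(self.advance().text))
        if self.tok.kind == "id":
            return ("var", self.advance().text)
        if self.tok.kind == "(":
            self.advance()
            inner = self.parse_expr()
            self.expect(")", "')'", "to close the parenthesis")
            return inner
        self.fail("a number, a name or '('", "in an expression")

def parse(src):
    parser = Parser(src)
    return parser.parse_program(), parser.errors
```

Calling parse("a = 1 + ; b = (2 * 3; c = 4;") should report both broken statements in a single pass and still return the third one. In a generator, the expected/context strings and the set of synchronization tokens would be derived from the grammar (or from annotations on it) instead of being written by hand as they are here.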