Bison print character which caused an error


I'm developing a simple compiler with Bison and Flex. My grammar only accepts strings matching [a-z]+. If the input is a0, I would like to display an error message along the lines of 'unrecognized character 0'. For now, anything outside the grammar is caught by a . rule at the end of the lexer, so the string a0 causes a syntax error. I set Bison's parse.error to verbose, but the message it produces lists tokens that are useless to the user. Is there a way to display what I want, i.e. show the user exactly what is wrong with the input?


Solution

  • This is easiest to do in the lexer rather than the parser. If your lexer (.l file) ends with a rule like the following (which assumes your yyerror is defined to accept printf-style arguments):

    .    { yyerror("ignoring unrecognized character '%c' in input", *yytext); }
    

    then the unrecognized characters will simply be skipped (after the message is printed). Alternatively, you can return them to the parser as single-character tokens:

    .    { return *yytext; }
    

    and then, if you set %define parse.error verbose in your Bison code, you'll get syntax errors that name the unexpected token and the token(s) that were expected.
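The first lexer rule above passes a format string plus an argument to yyerror, which only works if yyerror is written to accept them. A minimal sketch of such a definition follows, assuming you control yyerror yourself. The names error_count and report_bad_char are hypothetical helpers introduced for illustration; they are not part of Bison or Flex.

```c
#include <ctype.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical counter so the caller can tell whether any error was reported. */
static int error_count = 0;

/* printf-style yyerror. The generated parser calls it with a single string,
 * which is then used as the format, so this sketch assumes those messages
 * contain no '%' characters. */
void yyerror(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    fputs("error: ", stderr);
    vfprintf(stderr, fmt, ap);
    fputc('\n', stderr);
    va_end(ap);

    error_count++;
}

/* Hypothetical helper for the catch-all lexer rule: prints the offending
 * character literally when it is printable, otherwise as a hex escape. */
void report_bad_char(int c)
{
    unsigned char uc = (unsigned char)c;

    if (isprint(uc))
        yyerror("unrecognized character '%c'", uc);
    else
        yyerror("unrecognized character '\\x%02x'", (unsigned)uc);
}
```

With this in place, the catch-all rule could read `.    { report_bad_char(*yytext); }`, and the driver could check error_count after yyparse() returns to decide whether to continue.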