How do I define a pattern that will equate to all tokens not recognized by the scanner in Flex?


I am trying to define a pattern in Flex that will throw an error when it reads a token that isn't already defined. I tried this:

DIGIT       [0-9]
INT         -?[0-9][0-9]*
DOUBLE      {INT}"."({DIGIT})*
CHAR        [A-Za-z]
CHAR_       [A-Za-z_]
ID          {CHAR}({CHAR_}|{DIGIT})*
HEX         (0X | 0x)[a-fA-F0-9][a-fA-F0-9]*
STRINGLIT   \"(\\.|[^"])*\"
ERRSTRING   \"(\\.|[^"])*
UNRECCHAR   [^("+"|"-"|"*"|"/"|"%"|"<"|">"|"="|"!"|";"|","|"."|"["|"]"|"{"|"}"|{CHAR_}|{DIGIT})]

%%

"+"           {return '+';}
"-"           {return '-';}
"*"           {return '*';}
"/"           {return '/';}
"%"           {return '%';}
"<"           {return '<';}
">"           {return '>';}
"="           {return '=';}
"!"           {return '!';}
";"           {return ';';}
","           {return ',';}
"."           {return '.';}
"["           {return '[';}
"]"           {return ']';}
"("           {return '(';}
")"           {return ')';}
"{"           {return '{';}
"}"           {return '}';}

{UNRECCHAR} {
            ReportError::UnrecogChar(&yyloc, yytext);
            }

and

.           {
            ReportError::UnrecogChar(&yyloc, yytext);
            }

Neither version compiles. I get this error:

scanner.l: unrecognized rule

Why is this happening?

NOTE: The error handling is done in a separate C file.


Solution

  • The pattern . has been working for me for thirty years. You must have done something else wrong.

    But I suggest:

    . return yytext[0];
    

    That will match any character not already matched by a prior rule, except a newline, and return it to the parser, which can then deal with it through its own error recovery.

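    NB: you should handle unary minus in the parser, not the scanner. With -? inside INT, an input such as a-1 is scanned as an identifier followed by the integer -1, so the subtraction operator never reaches the parser.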
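
To make the suggestion above concrete, here is a minimal standalone sketch of a scanner that ends with a catch-all rule. It is an illustration under stated assumptions, not the asker's scanner: the token code T_INT and the main() driver are hypothetical stand-ins for what a Bison-generated parser would normally supply. Flex prefers the longest match and, on a tie, the rule listed first, so a one-character . rule placed last only fires when nothing else matches. Patterns start in the first column, because flex treats indented lines in the rules section as code rather than as rules.

```lex
%option noyywrap

%{
#include <stdio.h>

/* Hypothetical token code for this sketch; a Bison parser would normally define it. */
enum { T_INT = 258 };
%}

DIGIT   [0-9]

%%

{DIGIT}+        { return T_INT; /* unsigned: a leading minus is left to the parser */ }
[ \t\r\n]+      { /* skip whitespace, including the newline that . does not match */ }
.               { return (unsigned char) yytext[0]; /* catch-all, must come last */ }

%%

/* Stand-in for the parser: print each token until yylex() returns 0 at end of input. */
int main(void)
{
    int tok;

    while ((tok = yylex()) != 0) {
        if (tok == T_INT)
            printf("INT %s\n", yytext);
        else
            printf("CHAR '%c'\n", tok);
    }
    return 0;
}
```

This can be built with flex scanner.l followed by cc lex.yy.c -o scanner. Operators such as + and stray characters such as @ both reach the caller as their own character code, so a real grammar accepts the ones it names as literal tokens and reports the rest as syntax errors. The cast to unsigned char keeps bytes above 127 from being returned as negative values.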