flex-lexer

flex warning "rule cannot be matched" for the "||" rule


flex gives me the warning below. If I delete line 41, the warning goes away.

"lex.l", line 41: warning, rule cannot be matched

line 41: "||" {printf("26,\"%s\"\n",yytext);}

digit [0-9]
letter [A-Za-z]
id ({letter}|[_])({letter}|{digit}|[_])*
%%
[ |\t|\n]+
"var" {printf("28,\"%s\"\n",yytext);}
"if" {printf("29,\"%s\"\n",yytext);}
"then" {printf("30,\"%s\"\n",yytext);}
"else" {printf("31,\"%s\"\n",yytext);}
"while" {printf("32,\"%s\"\n",yytext);}
"for" {printf("33,\"%s\"\n",yytext);}
"begin" {printf("34,\"%s\"\n",yytext);}
"writeln" {printf("35,\"%s\"\n",yytext);}
"procedure" {printf("36,\"%s\"\n",yytext);}
"end" {printf("37,\"%s\"\n",yytext);}
{id} {printf("1,\"%s\"\n",yytext);}
{digit}+ {printf("2,\"%s\"\n",yytext);}

...

"+=" {printf("23,\"%s\"\n",yytext);}
"-=" {printf("24,\"%s\"\n",yytext);}
"==" {printf("25,\"%s\"\n",yytext);}
"||" {printf("26,\"%s\"\n",yytext);}
"&&" {printf("27,\"%s\"\n",yytext);}
%%
#include <ctype.h>
int main(){
    yylex ( );
    return 0 ;
}
int yywrap(void){
    return 1;
}

Solution

  • This:

    [ |\t|\n]
    

    is a character class, which matches any one of the following four characters:

    • space
    • vertical bar (|)
    • tab
    • newline

    The vertical bar appears twice in the class, but since a character class is a set, the repetition is ignored.

    So

    [ |\t|\n]+
    

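    matches any non-empty sequence consisting only of the above characters. One such sequence is ||. When two rules match the same amount of text, flex uses the one that appears first in the file. Since the whitespace rule comes before the "||" rule and matches at least as much of the input whenever "||" would match, it is always the one used for ||. The "||" rule can therefore never be matched, as the warning says.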

    You should seriously consider using [[:space:]] to match any whitespace character, [[:alpha:]] to match a letter and [[:digit:]] to match a digit. Those are more self-documenting than writing out the set by hand. If you do write out the set, don't include a vertical bar unless you really mean to match one.

    Flex patterns are documented in the flex manual. It's worth reading.