
Lex: expected expression before ‘[’ token when writing regular expressions


I'm new to Lex/Yacc and am following this tutorial: Part 01: Tutorial on Lex/Yacc

Here's my Lex file:

%{
    #include "main.h"
    #include <stdio.h>
%}

%%
    [a-zA-Z][_a-zA-Z0-9]*   {return IDENTIFIER;}
    "&"                     {return RUN_DAEMON;}
    "|"                     {return SYM_PIPE;}
    ">"                     {return RED_STDOUT;}
    "<"                     {return RED_STDIN;}
    ">>"                    {return APP_STDOUT;}
    [ \t\n]+                {;}
    .                       {printf("unexpected character\n");}
%%

int yywrap(){
    return 1;
}

However, when I run the lex command and then compile lex.yy.c with gcc, it floods me with errors like these:

sbash.l: In function ‘yylex’:
sbash.l:7:5: error: expected expression before ‘[’ token
     [a-zA-Z][_a-zA-Z0-9]*   {return IDENTIFIER;}
     ^
sbash.l:7:6: error: ‘a’ undeclared (first use in this function)
     [a-zA-Z][_a-zA-Z0-9]*   {return IDENTIFIER;}
      ^
sbash.l:7:6: note: each undeclared identifier is reported only once for each function it appears in
sbash.l:7:14: error: ‘_a’ undeclared (first use in this function)
     [a-zA-Z][_a-zA-Z0-9]*   {return IDENTIFIER;}
              ^~
sbash.l:7:17: error: ‘zA’ undeclared (first use in this function)
     [a-zA-Z][_a-zA-Z0-9]*   {return IDENTIFIER;}
                 ^~
sbash.l:7:20: error: ‘Z0’ undeclared (first use in this function)
     [a-zA-Z][_a-zA-Z0-9]*   {return IDENTIFIER;}
                    ^~
sbash.l:7:29: error: expected expression before ‘{’ token
     [a-zA-Z][_a-zA-Z0-9]*   {return IDENTIFIER;}
                             ^
sbash.l:13:7: error: stray ‘\’ in program
     [ \t\n]+                {;}
       ^
sbash.l:13:9: error: stray ‘\’ in program
     [ \t\n]+                {;}

Unfortunately, I can't figure out what's going wrong, even after googling (many examples write their expressions exactly the same way as my code).

My Lex version is 2.6.1, running on CentOS 8.


Solution

  • As explained in the Flex manual chapter on flex input file format, pattern rules must start at the left margin:

    The rules section of the flex input contains a series of rules of the form:

      pattern   action 
    

    where the pattern must be unindented and the action must begin on the same line. (Some emphasis added)

    Indented lines in the rules section are just passed through as-is. In particular, indented lines prior to the first rule are inserted at the top of the yylex function, which is frequently useful. But flex makes no attempt to verify that code included in this way is valid; errors will be detected when the generated scanner is compiled. That is what happened here: every indented rule was copied into yylex as if it were C code, so gcc tried to parse the regular expressions as C expressions.
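
For illustration, here is a sketch of the same scanner with every pattern moved to the left margin. It assumes, as in the question, that main.h defines the token constants (IDENTIFIER, RUN_DAEMON, and so on); nothing else has been changed apart from spacing and the comments inside the actions.

```lex
%{
/* Code between %{ and %} is copied verbatim, so indentation is harmless here. */
#include "main.h"   /* assumed to define IDENTIFIER, RUN_DAEMON, ... */
#include <stdio.h>
%}

%%
[a-zA-Z][_a-zA-Z0-9]*   { return IDENTIFIER; }
"&"                     { return RUN_DAEMON; }
"|"                     { return SYM_PIPE; }
">"                     { return RED_STDOUT; }
"<"                     { return RED_STDIN; }
">>"                    { return APP_STDOUT; }
[ \t\n]+                { /* skip whitespace */ }
.                       { printf("unexpected character\n"); }
%%

int yywrap(void) {
    return 1;
}
```

Only the rules between the two %% markers need to start in column one. The order of the ">" and ">>" rules does not matter, because flex always prefers the longest match.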