
Need Lex regular expression to match string up to newline


I want to parse strings of the form:

a=some value
b=some other value

There are no blanks around '=', and each value extends up to the newline. There may be leading spaces.

My lex specification (relevant part) is:

%%  
a=  { printf("Found attr %s\n", yytext); return aATTR; }
^[ \r\t]+   { printf("Found space at the start %s\n", yytext); }
([^a-z]=).*$  { printf("Found value %s\n", yytext); }
\n  { return NEWLINE; }
%%  

I tried .*$, [^\n]* and a few other regular expressions, but to no avail. This looks like it should be simple. Any suggestions? I am aware that lex always picks the longest match, which complicates things: with some of the expressions I tried, the whole line is matched as a single token.


Solution

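  • You probably want to use separate start states (start conditions). They let you encode simple contexts: each rule is active only in the states listed before it, so the value pattern cannot compete with the attribute pattern for the longest match. The simple example below captures your id, operator and value on each call to yylex().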

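    %{
    /* Filled in by the scanner; read them after each yylex() call. */
    char id;      /* first letter of the attribute name */
    char op;      /* the operator character, '=' */
    char *value;  /* points into yytext: valid only until the next yylex() call */
    %}
    
    %x VAL OP
    %%
    <INITIAL>[a-z]+ {
        /* Attribute name: remember it and expect the operator next. */
        id = yytext[0];
        BEGIN OP;
    }
    <INITIAL,OP>[ \t]+    { /* skip blanks */ }
    <OP>=[ \t]* {
        /* Operator: everything after it belongs to the value. */
        op = yytext[0];
        BEGIN VAL;
    }
    <VAL>.*\n {
        /* Value: the rest of the line; drop the trailing newline. */
        yytext[yyleng - 1] = '\0';
        value = yytext;
        BEGIN INITIAL;
        return 1;
    }
    %%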
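
To make the role of the three states more concrete, here is a rough hand-written equivalent in plain C. It is only a sketch of the same idea, not what flex generates: the names `parse_line`, `struct pair` and the `ST_*` constants are made up for illustration, the value buffer size is an arbitrary assumption, and unlike the `<VAL>.*\n` rule it also accepts a last line with no newline and strips a trailing `\r`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names: the three contexts mirror INITIAL, OP and VAL. */
enum state { ST_INITIAL, ST_OP, ST_VAL };

struct pair {
    char id;          /* first letter of the attribute name */
    char op;          /* the operator character, '=' */
    char value[128];  /* text after '=', without the line ending (size is an assumption) */
};

/* Scan one "name=value" line. Returns 1 on success, 0 if the line is malformed. */
static int parse_line(const char *line, struct pair *out)
{
    enum state st = ST_INITIAL;
    const char *p = line;

    assert(line != NULL && out != NULL);

    for (;;) {
        switch (st) {
        case ST_INITIAL:
            while (*p == ' ' || *p == '\t') p++;   /* skip leading blanks */
            if (*p < 'a' || *p > 'z') return 0;    /* a name must start here */
            out->id = *p;
            while (*p >= 'a' && *p <= 'z') p++;    /* consume the whole name, like [a-z]+ */
            st = ST_OP;
            break;
        case ST_OP:
            while (*p == ' ' || *p == '\t') p++;   /* blanks are skipped in OP too */
            if (*p != '=') return 0;
            out->op = *p++;
            while (*p == ' ' || *p == '\t') p++;   /* like =[ \t]* */
            st = ST_VAL;
            break;
        case ST_VAL: {
            /* Only in this context does "everything up to the line ending" apply. */
            size_t n = strcspn(p, "\r\n");
            if (n >= sizeof out->value) n = sizeof out->value - 1;  /* truncate long values */
            memcpy(out->value, p, n);
            out->value[n] = '\0';
            return 1;
        }
        }
    }
}
```

Calling `parse_line("  a=some value\n", &kv)` on a `struct pair kv` fills in the id, the operator and the value in one pass. The point of the comparison is that the "rest of the line" logic runs only once the scanner is in the value context, which is what `BEGIN VAL` arranges in the lex version.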