
Processing character returned by yyless within a start condition in yacc


For the code snippet below, the "ASSN: =" action for {EQ} is never triggered for the input "CC=gcc\n". I don't understand why: the equals character is clearly being pushed back, because it shows up at the start of the text matched by the next rule, {CHAR}.

How can I ensure that the {EQ} rule is processed when the equals character is pushed back by yyless?

The byacc code is pretty much empty with a single dummy rule, but with the relevant %token lines.

%{
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <string.h>     /* strlen, strndup */
#include "y.tab.h"
extern YYSTYPE yylval;
%}

%x      ASSIGNMENT
%option noyywrap

DIGIT   [0-9]
ALPHA   [A-Za-z]
SPACE   [ ]
TAB     [\t]
WS      [ \t]+
NEWLINE (\n|\r|\r\n)
IDENT   [A-Za-z_][A-Za-z_0-9]+
EQ      =
CHAR    [^\r\n]+

%%

<*>"#"{CHAR}{NEWLINE}

({IDENT}{EQ})|({IDENT}{WS}{EQ}) { 
                        yylval.strval = strndup(yytext, 
                                strlen(yytext)-1);

                        printf("NORM: %s\n", yylval.strval);
                        yyless(strlen(yytext)-1);
                        BEGIN(ASSIGNMENT);
                        return TOK_IDENT;
                    }

<ASSIGNMENT>{

{EQ}                {
                        printf("ASSN: =\n");
                        return TOK_ASSIGN;
                    }

{CHAR}              {
                        printf("ASSN: %s\n", yytext);
                        return TOK_STRING;
                    }

{NEWLINE}           {   
                        BEGIN(INITIAL); 
                    }
}

{WS}
{NEWLINE}

.                   { 
                        printf("DOT : %s\n", yytext);
                    }

<*><<EOF>>          { 
                        printf("EOF\n");
                        return 0;       
                    }

%%

int main()
{
    printf("Start\n\n");
    int ret;
    while( (ret = yylex()) )    {
        printf("LEX : %d\n", ret);
    }
    printf("\nEnd\n");
}

Example output:

Start

NORM: CC
LEX : 257
ASSN: =gcc
LEX : 259
EOF

End

Solution

  • My issue was that flex always picks the rule with the longest match, so in the ASSIGNMENT state {CHAR} (which matches "=gcc") was always winning over {EQ} (which matches only "="). I solved this by introducing another Start Condition to consume the {EQ}{WS}? before passing to
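
The longest-match behavior behind this can be illustrated with a small C model. This is only a sketch, not flex's actual implementation: `match_eq`, `match_char`, `pick_rule`, and the `ST_EXPECT_EQ` state are hypothetical names invented for illustration, and the extra state is simplified to accept only `=` (the optional whitespace is left out).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical rule ids and scanner states, for illustration only. */
enum rule  { RULE_NONE, RULE_EQ, RULE_CHAR };
enum state { ST_ASSIGNMENT,   /* both {EQ} and {CHAR} are active      */
             ST_EXPECT_EQ };  /* assumed extra state: only {EQ} active */

/* Length matched by EQ (a single '='), or 0 if there is no match. */
static size_t match_eq(const char *s)
{
    return s[0] == '=' ? 1 : 0;
}

/* Length matched by CHAR ([^\r\n]+), or 0 if there is no match. */
static size_t match_char(const char *s)
{
    return strcspn(s, "\r\n");
}

/*
 * Choose a rule the way a lex-style scanner does: the longest match
 * wins, and on a tie the rule listed first (here EQ) wins.
 */
static enum rule pick_rule(enum state st, const char *s, size_t *len)
{
    size_t eq = match_eq(s);
    size_t ch = (st == ST_ASSIGNMENT) ? match_char(s) : 0;

    if (eq == 0 && ch == 0) {
        *len = 0;
        return RULE_NONE;
    }
    if (eq >= ch) {           /* equal lengths: earlier rule wins */
        *len = eq;
        return RULE_EQ;
    }
    *len = ch;
    return RULE_CHAR;
}
```

With the remaining input "=gcc\n", `pick_rule(ST_ASSIGNMENT, ...)` selects `RULE_CHAR` with a length of 4, which corresponds to the "ASSN: =gcc" line in the example output. In the assumed `ST_EXPECT_EQ` state, `CHAR` is not a candidate, so the same input selects `RULE_EQ` with a length of 1.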