
Is there a way to loop over pattern-action pairs in a lex program?


I was wondering whether there is a way to loop in the rules section of a lex program, so that I can iterate over both the patterns and the actions.

Something like this:

%{
  char *pattern[] = {a,b,c,d,e};
%}

%%
 for(i=0;i<5;i++){
   (pattern[i]){action[i]}
 }
%%

//Some functions

Is this kind of iteration possible?

I am looking for a way to write a lex program that can identify all C language keywords.


Solution

  • I'm not sure exactly how looping will help you solve this problem. (F)lex already loops, repeatedly finding a token until some action returns (or EOF is reached and the default EOF action returns).

    To identify keywords, just write the keywords out as patterns:

    %{
      #include <stdio.h>  /* for printf() in main() */
      int keywords = 0;
    %}
    %option noyywrap
    %%
     /* List of keywords taken from http://port70.net/~nsz/c/c11/n1570.html#6.4.1 */
    auto                    { ++keywords; }
    break                   { ++keywords; }
    case                    { ++keywords; }
    char                    { ++keywords; }
    const                   { ++keywords; }
    continue                { ++keywords; }
    default                 { ++keywords; }
    do                      { ++keywords; }
     /* Etc. */
    [[:alpha:]_][[:alnum:]_]*                ; /* Ignore other identifiers */
    \.?[[:digit:]]([[:alnum:].]|[EePp][+-])* ; /* And pp-numbers */
     /* The next one matches too much; it will cause keywords inside comments
      * and quoted strings to be accepted. So you still have some work to do.
      * It excludes '_' so that an identifier such as _auto is not split into
      * "_" and a keyword. */
    [^[:alnum:]_]+                           ; /* And everything else */
    
    %%
    int main(void) {
      yylex();
      printf("%d keywords found.\n", keywords);
      return 0;
    }
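
A different technique from the one above, sometimes used when the keyword list is long, is to keep a single identifier rule and look each matched word up in a sorted table. The following is only a sketch of that idea: `keyword_table`, `compare_keyword` and `keyword_index` are hypothetical names, and the table holds just the eight keywords already listed, so it would need to be completed (and kept sorted) before use.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical table: the first few C11 keywords, in strcmp() order.
 * bsearch() requires the table to stay sorted. */
static const char *const keyword_table[] = {
    "auto", "break", "case", "char", "const", "continue", "default", "do"
    /* Etc. */
};

enum { KEYWORD_COUNT = sizeof keyword_table / sizeof keyword_table[0] };

/* bsearch() comparator: key is a C string, elem points at a table slot. */
static int compare_keyword(const void *key, const void *elem) {
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Return the table index of word, or -1 if word is not in the table. */
int keyword_index(const char *word) {
    const char *const *hit = bsearch(word, keyword_table, KEYWORD_COUNT,
                                     sizeof keyword_table[0], compare_keyword);
    return hit ? (int)(hit - keyword_table) : -1;
}
```

With these definitions placed in the `%{ ... %}` block, the action of the identifier rule could become `{ if (keyword_index(yytext) >= 0) ++keywords; }`, and the individual keyword rules would no longer be needed. The returned index also tells the keywords apart.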
    

    If you need to distinguish between the keywords, you'll need to do something more sophisticated. But a good text editor should let you convert the list of keywords into any simple repeated action, such as

    auto     { return TOKEN_auto; }
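
If a text editor is not convenient, the same conversion can be scripted. This is where a loop does fit, but it runs before flex sees the file, not inside the rules section. The sketch below is one possible way to do it in C. `format_rule` and `emit_rules` are hypothetical helper names, and the `TOKEN_` constants are assumed to be defined elsewhere, for example in a parser header.

```c
#include <stdio.h>

/* Format one flex rule such as "auto     { return TOKEN_auto; }" into buf.
 * The keyword is padded to 8 columns and always followed by a space, so the
 * pattern and the action stay separated even for long keywords.
 * Returns the snprintf() result: the rule's length, excluding the NUL. */
int format_rule(char *buf, size_t size, const char *keyword) {
    return snprintf(buf, size, "%-8s { return TOKEN_%s; }", keyword, keyword);
}

/* Write one rule per keyword to out, one rule per line. */
void emit_rules(FILE *out, const char *const *keywords, size_t count) {
    char rule[128];
    for (size_t i = 0; i < count; ++i) {
        format_rule(rule, sizeof rule, keywords[i]);
        fprintf(out, "%s\n", rule);
    }
}
```

A small driver would call `emit_rules(stdout, keywords, count)` with the full keyword list and redirect the output into the rules section of the `.l` file, between the two `%%` lines.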