Hello world Flex/Bison parser generates a bunch of warning messages ... how to get rid of them?


Flex/Bison newbie here.

I am trying to create a hello world lexer/parser. When I run the parser, I want it to read a line that I type at its prompt. If I enter this:

Hello, World

then I want the parser to print out this XML:

<Document>Hello, World</Document>

Here is my lexer (helloworld.l):

%{
#include "helloworld.tab.h"
%}

%%
.+              { yylval = yytext; return(DATA); }
\n              { return(EOL); }
%%
int yywrap(){ return 1;}

Here is my parser (helloworld.y):

%{
#include <stdio.h>
%}

%token DATA
%token EOL

%%
start: /* nothing */
 | data EOL { printf("<Document>%s</Document>\n> ", $1); }
 ;

data: DATA     { $$ = $1; }
 ;
%%

int main()
{
  printf("> "); 
  yyparse();
  return 0;
}

yyerror(char *s)
{
  fprintf(stderr, "error: %s\n", s);
}

Here is my Makefile:

main: helloworld.l helloworld.y
    ..\..\win_bison -d helloworld.y
    ..\..\win_flex helloworld.l
    gcc -o $@ helloworld.tab.c lex.yy.c

When I run make I get the following warning messages. What do they mean? How do I fix them?

helloworld.tab.c: In function 'yyparse':
helloworld.tab.c:576:16: warning: implicit declaration of function 'yylex' [-Wimplicit-function-declaration]
  576 | # define YYLEX yylex ()
      |                ^~~~~
helloworld.tab.c:1236:16: note: in expansion of macro 'YYLEX'
 1236 |       yychar = YYLEX;
      |                ^~~~~
helloworld.tab.c:1378:7: warning: implicit declaration of function 'yyerror'; did you mean 'yyerrok'? [-Wimplicit-function-declaration]
 1378 |       yyerror (YY_("syntax error"));
      |       ^~~~~~~
      |       yyerrok
helloworld.y: At top level:
helloworld.y:24:1: warning: return type defaults to 'int' [-Wimplicit-int]
   24 | yyerror(char *s)
      | ^~~~~~~
helloworld.l: In function 'yylex':
helloworld.l:6:10: warning: assignment to 'YYSTYPE' {aka 'int'} from 'char *' makes integer from pointer without a cast [-Wint-conversion]
    6 | .+              { yylval = yytext; return(DATA); }
      |          ^

Solution

  • C has not permitted functions without a declared return type (like your yyerror) since C99, more than 20 years ago, which suggests that you are using a very old example as a template. I strongly suggest that you look at the examples in the Bison manual. Although they don't show integration with flex, they will compile without complaints on modern C compilers.

    When that example was written, it was probably acceptable for functions which returned int to not be declared at all, but in this century you need to provide declarations for every function before it is used.

    The parser generated by bison expects you to define two functions, yylex and yyerror. (Typically, you use flex to generate yylex but that's not relevant to bison, since the scanner is compiled separately.) You must declare these functions in the prologue to the generated parser (where you also put the #include directives for whatever library functions your actions will require).

    For a simple parser, the declaration might be:

    %{
    #include <stdio.h>
    int yylex(void);
    void yyerror(const char* msg);
    %}
    

    Note that I changed the parameter declaration of yyerror by adding const, in keeping with modern style: the argument might be a string literal, so yyerror must not attempt to modify it. The matching implementation would be:

    void yyerror(const char* msg) {
      fprintf(stderr, "%s\n", msg);
    }
    

    Other than that, you must fix your scanner action, because you should not rely on the value of yytext after the scanner returns. (More accurately, the value will change on the next call to yylex, but the parser is allowed to read ahead, and often does.) So you must copy yytext if you want to pass it on to the parser:

    .+      { yylval = malloc(yyleng + 1);
              strcpy(yylval, yytext);
              return(DATA); }
    

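    (I wrote it out because you seem to be using Windows; normally, I would just use strdup, and if your implementation includes it I'd suggest you do that, too. Either way, the scanner's prologue needs #include <stdlib.h> and #include <string.h> so that malloc and strcpy are declared.)

    Copying the text does not by itself silence the last warning, because, as the message says, YYSTYPE is still int. The semantic value type has to be declared as char *, for example with %define api.value.type {char *} in helloworld.y, assuming your Bison is version 3.0 or later.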
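
To make the point about yytext concrete, here is a small stand-alone C sketch that does not use flex at all. It only mimics the relevant behaviour: a scanner that hands out a pointer into one internal buffer and overwrites that buffer on every call. The names next_token, copy_token and token_buf are hypothetical and exist only for this illustration; copy_token is the same malloc-plus-strcpy idea as the scanner action above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a scanner's internal buffer. Like yytext,
   the pointer handed out below refers to storage that the next call
   overwrites. */
static char token_buf[64];

/* Pretend to scan one token: store its text in the shared buffer and
   return a pointer to that buffer (not a fresh copy). */
const char *next_token(const char *text)
{
    strncpy(token_buf, text, sizeof token_buf - 1);
    token_buf[sizeof token_buf - 1] = '\0';
    return token_buf;
}

/* Portable equivalent of strdup: copy the token text into storage that
   the caller owns. Returns NULL if malloc fails; the caller must
   free() the result. */
char *copy_token(const char *tok)
{
    char *copy = malloc(strlen(tok) + 1);
    if (copy != NULL)
        strcpy(copy, tok);
    return copy;
}
```

If a caller keeps the pointer returned by next_token("Hello, World") and then calls next_token("\n"), which is roughly what happens when the parser reads ahead to the EOL token, the kept pointer now sees the newline text. A copy made with copy_token before the second call still reads Hello, World, which is why the scanner action copies yytext before returning.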