Flex/Bison newbie here.
I am trying to create a hello world lexer/parser. When I run the parser I want it to read from the command line. If I enter this:
Hello, World
then I want the parser to print out this XML:
<Document>Hello, World</Document>
Here is my lexer (helloworld.l)
%{
#include "helloworld.tab.h"
%}
%%
.+ { yylval = yytext; return(DATA); }
\n { return(EOL); }
%%
int yywrap(){ return 1;}
Here is my parser (helloworld.y)
%{
#include <stdio.h>
%}
%token DATA
%token EOL
%%
start: /* nothing */
| data EOL { printf("<Document>%s</Document>\n> ", $1); }
;
data: DATA { $$ = $1; }
;
%%
int main()
{
printf("> ");
yyparse();
return 0;
}
yyerror(char *s)
{
fprintf(stderr, "error: %s\n", s);
}
Here is my Makefile
main: helloworld.l helloworld.y
	..\..\win_bison -d helloworld.y
	..\..\win_flex helloworld.l
	gcc -o $@ helloworld.tab.c lex.yy.c
When I run make I get the following warning messages. What do they mean? How do I fix them?
helloworld.tab.c: In function 'yyparse':
helloworld.tab.c:576:16: warning: implicit declaration of function 'yylex' [-Wimplicit-function-declaration]
576 | # define YYLEX yylex ()
| ^~~~~
helloworld.tab.c:1236:16: note: in expansion of macro 'YYLEX'
1236 | yychar = YYLEX;
| ^~~~~
helloworld.tab.c:1378:7: warning: implicit declaration of function 'yyerror'; did you mean 'yyerrok'? [-Wimplicit-function-declaration]
1378 | yyerror (YY_("syntax error"));
| ^~~~~~~
| yyerrok
helloworld.y: At top level:
helloworld.y:24:1: warning: return type defaults to 'int' [-Wimplicit-int]
24 | yyerror(char *s)
| ^~~~~~~
helloworld.l: In function 'yylex':
helloworld.l:6:10: warning: assignment to 'YYSTYPE' {aka 'int'} from 'char *' makes integer from pointer without a cast [-Wint-conversion]
6 | .+ { yylval = yytext; return(DATA); }
| ^
It's been more than 20 years since C permitted functions without declared return types (like your yyerror), which indicates that you are using a very old example as a template. I strongly suggest that you look at the examples in the Bison manual. Although they don't show integration with flex, they will compile without complaints on modern C compilers.
When that example was written, it was probably acceptable for functions which returned int to not be declared at all, but in this century you need to provide declarations for every function before it is used.
The parser generated by bison expects you to define two functions, yylex and yyerror. (Typically, you use flex to generate yylex, but that's not relevant to bison, since the scanner is compiled separately.) You must declare these functions in the prologue to the generated parser (where you also put the #include directives for whatever library functions your actions will require).
For a simple parser, the declaration might be:
%{
#include <stdio.h>
int yylex(void);
void yyerror(const char* msg);
%}
Note that I changed the parameter declaration of yyerror by adding const, which is consistent with modern style: since the argument might be a string literal, yyerror must not attempt to modify it. The matching implementation would be:
void yyerror(const char* msg) {
fprintf(stderr, "%s\n", msg);
}
Other than that, you must fix your scanner action, because you should not rely on the value of yytext after the scanner returns. (More accurately, the value will change on the next call to yylex, but the parser is allowed to read ahead, and often does.) So you must copy yytext if you want to pass it on to the parser:
.+ { yylval = malloc(yyleng + 1);
strcpy(yylval, yytext);
return(DATA); }
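(I wrote it out because you seem to be using Windows; normally, I would just use strdup, and if your implementation includes it I'd suggest you do that, too.) The action needs <stdlib.h> and <string.h> in the scanner's prologue for malloc and strcpy. It also assumes that yylval can hold a char *, which is not true of the default int type that your last warning reports; the semantic value type has to be declared in helloworld.y, for example with %define api.value.type {char *} in Bison 3.0 or later.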
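To see why the copy matters, here is a minimal sketch in plain C, outside of flex entirely. The names scan_buffer, scan_token and copy_token_text are hypothetical stand-ins invented for this illustration: scan_buffer plays the role of the buffer that yytext points into, scan_token imitates one call to yylex, and copy_token_text does what the malloc/strcpy action above (or strdup) does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the scanner's internal buffer (what yytext points into). */
static char scan_buffer[64];

/* Imitates one scanner call: the new token overwrites the shared buffer in place,
   and the returned pointer always refers to that same buffer. */
const char *scan_token(const char *text)
{
    strncpy(scan_buffer, text, sizeof scan_buffer - 1);
    scan_buffer[sizeof scan_buffer - 1] = '\0';
    return scan_buffer;
}

/* Portable equivalent of strdup for a token of known length.
   Returns a NUL-terminated heap copy, or NULL if allocation fails.
   The caller owns the result and must free() it. */
char *copy_token_text(const char *text, size_t len)
{
    assert(text != NULL);
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, text, len);
    copy[len] = '\0';
    return copy;
}
```

If you keep the pointer returned by scan_token("Hello, World") and then call scan_token("\n"), the kept pointer now reads "\n", which is the kind of surprise a parser with lookahead can produce. A string made with copy_token_text before the second call still reads "Hello, World". In a real grammar, whoever consumes the semantic value (here, the action that prints the document) should free it once it is done.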