I'm trying to create a simple parser/compiler, mostly for homework, but eventually for learning purposes and for fun too. I've written both the lexer and the parser file (for an initial subset of commands) and I want to output an AST. However, I'm stuck at a "syntax error" message, even when I'm trying to parse a simple '1+1'. Here is the lexer file:
%{
#include "parser.tab.h"
%}
DIGIT [0-9]
LETTER [a-zA-Z]
%%
[ \t\n] ;
{DIGIT}+ {yylval = atoi(yytext); return NUMBER;}
{LETTER}* { if (strlen(yytext) <= 8){
                printf( "<ID, %s> ", yytext );
            } else {
                yytext[8] = '\0';
                printf("WARNING! Long identifier. Truncating to 8 chars\n");
                printf( "<ID, %s> ", yytext );
            }
          }
"+" {printf("Found '+' symbol\n");return(PLUS);}
"-" return(MINUS);
"*" return(TIMES);
"/" return(DIVIDE);
"(" return(LEFT_PARENTHESIS);
")" return(RIGHT_PARENTHESIS);
<<EOF>> return(END_OF_FILE);
%%
int yywrap (void) {return 1;}
And here is the parser file:
%{
#include <stdio.h>
/*#include "tree.h"
#include "treedefs.h"*/
int yylex();
#define YYSTYPE int
%}
%start program
%token NUMBER
%token ID
%token PLUS MINUS TIMES EQUAL
%token LEFT_PARENTHESIS RIGHT_PARENTHESIS
%token LET IN AND
%token END_OF_FILE
%left PLUS MINUS
%left TIMES DIVIDE
%%
program: /* empty */
       | exp { printf("Result: %d\n", $1); }
       | END_OF_FILE {printf("Encountered EOF\n");}
       ;
exp: NUMBER { $$ = $1;}
   | exp PLUS exp { $$ = $1 + $3; }
   | exp TIMES exp { $$ = $1 * $3; }
   | '(' exp ')' { $$ = $2;}
   ;
%%
int yyerror (char *s) {
    fprintf (stderr, "%s\n", s);
    return 0;
}
I've also created a main.c to keep the main() function separate. You can ignore the tree*.h files, as they only contain functions related to the AST.
#include <stdio.h>
#include <stdlib.h>
#include "tree.h"
#include "treedefs.h"
int main(int argc, char **argv){
    yyparse();
    TREE *RootNode = malloc(sizeof(TREE));
    return 0;
}
I've read tons of examples, but I couldn't find anything (VERY) different from what I wrote. What am I doing wrong? Any help will be greatly appreciated.
Your grammar accepts an expression OR an end of file. So if you give it an expression followed by an end of file, you get an error.
Another problem is that you return the token END_OF_FILE at the end of the input, rather than 0. Bison expects a 0 for the EOF token and will give a syntax error if it doesn't see one at the end of the input.
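The easiest fix for both of those is to get rid of the END_OF_FILE token and have the <<EOF>> rule return 0 (or drop that rule altogether, since flex's default end-of-file action already returns 0 when yywrap() returns 1). Then your grammar becomes: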
program: /* empty */ { printf("Empty input\n"); }
       | exp { printf("Result: %d\n", $1); }
       ;
...rest of the grammar
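On the lexer side, the contract is that yylex() hands back a positive token code for each token and 0 once the input runs out. Purely to illustrate that contract, here is a rough hand-written C stand-in for the flex-generated scanner (it is not the flex file itself). The token values, the set_input() helper, and scanning from a string instead of stdin are assumptions made for this sketch; in the real project the token codes come from parser.tab.h.

```c
#include <ctype.h>

/* Hypothetical token codes for this sketch; the real ones are
   generated by bison into parser.tab.h. */
enum { NUMBER = 258, PLUS, MINUS, TIMES, DIVIDE,
       LEFT_PARENTHESIS, RIGHT_PARENTHESIS };

int yylval;                     /* semantic value of the last NUMBER */
static const char *input = "";  /* assumption: scan a string, not stdin */

/* Hypothetical helper: point the scanner at a new input string. */
void set_input(const char *s) { input = s; }

/* Returns a positive token code, or 0 once the input is exhausted. */
int yylex(void)
{
    /* Skip whitespace, like the "[ \t\n] ;" rule. */
    while (*input == ' ' || *input == '\t' || *input == '\n')
        input++;

    /* End of input: return 0, not a custom END_OF_FILE token. */
    if (*input == '\0')
        return 0;

    /* A run of digits becomes one NUMBER token with its value in yylval. */
    if (isdigit((unsigned char)*input)) {
        yylval = 0;
        while (isdigit((unsigned char)*input))
            yylval = yylval * 10 + (*input++ - '0');
        return NUMBER;
    }

    int c = (unsigned char)*input++;
    switch (c) {
    case '+': return PLUS;
    case '-': return MINUS;
    case '*': return TIMES;
    case '/': return DIVIDE;
    case '(': return LEFT_PARENTHESIS;
    case ')': return RIGHT_PARENTHESIS;
    default:  return c;  /* unknown character: pass it on as a literal token */
    }
}
```

Scanning 1+1 with this yields NUMBER, PLUS, NUMBER and then 0, which is the sequence the parser needs to see in order to accept the input.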
Now you have the (potential) issue that your grammar only accepts a single expression. You might want to support multiple expressions separated by newlines or some other separator (; perhaps?), which can be done in a variety of ways.