
Cannot find cause of 'syntax error' message in Bison


I'm trying to create a simple parser/compiler, mostly for homework, but eventually for learning purposes and for fun too. I've written both the lexer and the parser file (for an initial subset of commands) and I want to output an AST. However, I'm stuck at a "syntax error" message, even when I'm trying to parse a simple '1+1'. Here is the lexer file:

%{
#include "parser.tab.h"
%}

DIGIT   [0-9]
LETTER   [a-zA-Z]


%%
[ \t\n]               ;

{DIGIT}+        {yylval = atoi(yytext); return NUMBER;}

{LETTER}*       {       if (strlen(yytext) <= 8){
                                printf( "<ID, %s> ", yytext );
                        } else {
                                yytext[8] = '\0';
                                printf("WARNING! Long identifier. Truncating to 8 chars\n");
                                printf( "<ID, %s> ", yytext );
                        }
                }

"+"      {printf("Found '+' symbol\n");return(PLUS);}
"-"      return(MINUS);
"*"      return(TIMES);
"/"      return(DIVIDE);
"("      return(LEFT_PARENTHESIS);
")"      return(RIGHT_PARENTHESIS);
<<EOF>>  return(END_OF_FILE);

%%
int yywrap (void) {return 1;}

And here is the parser file:

%{
#include <stdio.h>
/*#include "tree.h"
#include "treedefs.h"*/
int yylex();
#define YYSTYPE int
%}

%start program

%token  NUMBER
%token  ID
%token  PLUS    MINUS   TIMES   EQUAL
%token  LEFT_PARENTHESIS        RIGHT_PARENTHESIS

%token  LET     IN      AND
%token  END_OF_FILE

%left   PLUS    MINUS
%left   TIMES   DIVIDE
%%

program:        /* empty */
                | exp   { printf("Result: %d\n", $1); }
                | END_OF_FILE {printf("Encountered EOF\n");}
                ;
exp:   NUMBER                  { $$ = $1;}
     | exp PLUS exp          { $$ = $1 + $3; }
     | exp TIMES exp          { $$ = $1 * $3; }
     | '(' exp ')'          { $$ = $2;}
     ;

%%

int yyerror (char *s) {fprintf (stderr, "%s\n", s);
}

Also, I've created a main.c to keep the main() function separate. You can ignore the tree*.h files, as they only contain functions related to the AST.

#include <stdio.h>
#include <stdlib.h>
#include "tree.h"
#include "treedefs.h"

int main(int argc, char **argv){
        yyparse();
        TREE *RootNode = malloc(sizeof(TREE));
        return 0;
}

I've read tons of examples, but I couldn't find anything (VERY) different from what I wrote. What am I doing wrong? Any help will be greatly appreciated.


Solution

  • Your grammar accepts either an expression or an END_OF_FILE token, but not one followed by the other. So when the lexer hands it an expression followed by END_OF_FILE, as it does for '1+1', you get a syntax error.

    Another problem is that you return the token END_OF_FILE at the end of the input, rather than 0 -- bison is expecting a 0 for the EOF token and will give a syntax error if it doesn't see one at the end of the input.

    The easiest fix for both of those is to get rid of the END_OF_FILE token and have the <<EOF>> rule return 0. Then your grammar becomes:

    program:        /* empty */ { printf("Empty input\n"); }
                    | exp   { printf("Result: %d\n", $1); }
                    ;
           ...rest of the grammar
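
    On the lexer side, the matching change might look like the sketch below. It assumes the rest of the lexer file stays as posted; only the <<EOF>> rule changes.

    ```lex
    <<EOF>>  return 0;   /* 0 is the end-of-input token the Bison parser waits for */
    ```

    Since yywrap() in the posted lexer returns 1, flex already makes yylex() return 0 at end of input by default, so deleting the <<EOF>> rule altogether should have the same effect. The %token END_OF_FILE declaration in the parser file is then no longer needed.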
    

    Now you have the (potential) issue that your grammar only accepts a single expression. You might want to support multiple expressions separated by newlines or some other separator (; perhaps?), which can be done in a variety of ways.
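
    One of those ways, as a minimal sketch: make program left-recursive so that it accepts any number of expressions, each terminated by a semicolon. The SEMICOLON token and its lexer rule are hypothetical additions, not part of the posted files.

    ```yacc
    /* Hypothetical new token; the lexer would need a rule such as
       ";"   return(SEMICOLON);                                    */
    %token  SEMICOLON

    %%

    program:        /* empty */
                    | program exp SEMICOLON { printf("Result: %d\n", $2); }
                    ;
    ```

    With this rule, an input such as "1+1; 2*3;" should print one result per expression. To use newlines as the terminator instead, the lexer's [ \t\n] rule would have to stop discarding '\n' and return a token for it.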