Output from yacc is delayed


I have the following lex and yacc code in bas.l and bas.y. I am trying to build a simple calculator that supports only addition (+), subtraction (-), multiplication (*) and division (/). I want each expression, which may contain more than two numbers joined by the same operation, to be evaluated and its answer printed immediately. Instead, the output is delayed. Here is the code:

bas.y

%{
#include <stdio.h>
int yylex(void);
void yyerror(char *);
%}

%token INTEGER

%%
program:
    program statement {printf("Answer : %d\n", $2);}
    |
    ;
statement:
    INTEGER {$$=$1;}
    |statement '+' INTEGER {$$ = $1 + $3;}
    |statement '-' INTEGER {$$ = $1 - $3;}
    |statement '*' INTEGER {$$ = $1 * $3;}
    |statement '/' INTEGER {$$ = $1 / $3;}
    ;

%%

void yyerror(char *s) {
    fprintf(stderr, "%s\n", s);
}

int main(void){
    yyparse();
    return 0;
}

bas.l

%{
#include "y.tab.h"
#include <stdlib.h>
void yyerror(char *);
%}

%%
[0-9]+ {yylval=atoi(yytext);return INTEGER;}
[-+*/] {return *yytext;}
[ \n]
%%
int yywrap(void) {
    return 1;
}

This was the input I gave:

54+1
25-54

Output:

Answer : 55
Answer : -29

But the output is delayed. When I enter the first line, 54+1, I expect Answer : 55 to be printed immediately. Instead, the program keeps waiting for input. Only when I enter the second line, 25-54, do I get Answer : 55, and the second answer still does not appear. When I then send end-of-file with Control + D (I am on Mac OS), I get the second output, Answer : -29.

I expect the output to be printed immediately after I enter each line. I am absolutely new to lex and yacc, so all help is much appreciated. Please also explain what is happening internally so that I understand why this behaviour occurs. You can assume that I know the theory of parsing: bottom-up parsing, shift/reduce operations in building a syntax tree, etc.


Solution

  • Your lexer ignores whitespace. That is fine for spaces, but it means the parser never learns about the Enter key either. As a result, you get a parser that is entirely whitespace-agnostic: you can split a single statement across multiple lines and it will still work.

    This is not inherently bad but the parser has no way of knowing that the previous statement has ended and the next has started until it sees at least part of the next statement (or EOF). After all, you could write:

    4 * 3
    + 5
    

    The correct result is 17 and it would be incorrect to return 12 after parsing just the first line.
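
The waiting can be made concrete with a small hand-written C sketch. It is an illustration of the idea, not the code yacc generates. It evaluates the first statement the way the original whitespace-ignoring grammar would, and counts how many tokens had to be read before the answer is known, counting the lookahead token or end of input. The names tokens_until_first_answer and skip_ws are hypothetical, and the division-by-zero check is an addition that the original grammar does not have.

```c
#include <ctype.h>

/* Hypothetical helper: skip blanks AND newlines, like the original `[ \n]` rule. */
static const char *skip_ws(const char *p) {
    while (*p == ' ' || *p == '\n') p++;
    return p;
}

/*
 * Evaluate the first statement in `input` strictly left to right (no operator
 * precedence, as in the question's grammar). Returns the number of tokens read
 * before the answer is final, INCLUDING the lookahead (the next statement's
 * first token, or end of input). Returns -1 on malformed input.
 */
int tokens_until_first_answer(const char *input, int *answer) {
    const char *p = skip_ws(input);
    int tokens = 0;
    int acc = 0;
    char op = '+';                      /* first number is folded in as 0 + n */

    for (;;) {
        if (!isdigit((unsigned char)*p)) return -1;
        int n = 0;
        while (isdigit((unsigned char)*p)) n = n * 10 + (*p++ - '0');
        tokens++;                       /* INTEGER */

        switch (op) {
        case '+': acc += n; break;
        case '-': acc -= n; break;
        case '*': acc *= n; break;
        case '/': if (n == 0) return -1; acc /= n; break;
        }

        p = skip_ws(p);                 /* the newline is invisible here */
        if (*p == '+' || *p == '-' || *p == '*' || *p == '/') {
            op = *p++;
            tokens++;                   /* operator: the statement continues */
            p = skip_ws(p);
            continue;
        }

        /* Anything else is the lookahead that proves the statement is over. */
        tokens++;
        *answer = acc;
        return tokens;
    }
}
```

For the input 54+1 followed by 25-54, this reports 4 tokens: the answer 55 only becomes final once 25 has been read. For 4 * 3 followed by + 5, it reports 6 (five tokens plus end of input) and the answer 17.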


    To fix it, you either have to introduce some character that marks the end of a statement (e.g. ;) or stop ignoring \n in your lexer.

    I'll go with the second solution.

    Add a new token ENTER to your parser and change the program rule so that a statement only ends when the parser sees ENTER:

    %token ENTER
    
    program:
        program statement ENTER {printf("Answer : %d\n", $2);}
        |
        ;
    

    In your lexer, stop ignoring \n and return the token ENTER instead:

    " "    {/* do nothing */}
    \n     {return ENTER;}
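
To see the effect of the ENTER token without running yacc, here is a minimal hand-written C sketch of the same idea. It is not what yacc generates. The names Scanner, next_token, eval_program and the TOK_ constants are made up for illustration, and the division-by-zero check is an addition that the grammar above does not have. Each answer is recorded at the moment the ENTER token arrives, so nothing has to wait for the next line.

```c
#include <ctype.h>

/* Illustrative token codes; single-character operators are returned as themselves. */
enum { TOK_EOF = 0, TOK_INTEGER = 256, TOK_ENTER = 257 };

typedef struct {
    const char *p;   /* current read position */
    int value;       /* value of the last INTEGER token */
} Scanner;

/* Mirrors the revised lexer: spaces are skipped, '\n' becomes ENTER. */
static int next_token(Scanner *s) {
    while (*s->p == ' ') s->p++;
    if (*s->p == '\0') return TOK_EOF;
    if (*s->p == '\n') { s->p++; return TOK_ENTER; }
    if (isdigit((unsigned char)*s->p)) {
        s->value = 0;
        while (isdigit((unsigned char)*s->p))
            s->value = s->value * 10 + (*s->p++ - '0');
        return TOK_INTEGER;
    }
    return (unsigned char)*s->p++;
}

/*
 * Evaluates every "statement ENTER" in `input`, left to right without
 * precedence. Stores the answers and returns how many there are, or -1 on a
 * syntax error, division by zero, or too many answers.
 */
int eval_program(const char *input, int *answers, int max_answers) {
    Scanner s = { input, 0 };
    int count = 0;
    int tok = next_token(&s);

    while (tok != TOK_EOF) {
        if (tok != TOK_INTEGER) return -1;   /* a statement starts with INTEGER */
        int acc = s.value;
        tok = next_token(&s);

        while (tok == '+' || tok == '-' || tok == '*' || tok == '/') {
            int op = tok;
            if (next_token(&s) != TOK_INTEGER) return -1;
            int rhs = s.value;
            switch (op) {
            case '+': acc += rhs; break;
            case '-': acc -= rhs; break;
            case '*': acc *= rhs; break;
            case '/': if (rhs == 0) return -1; acc /= rhs; break;
            }
            tok = next_token(&s);
        }

        if (tok != TOK_ENTER) return -1;     /* only ENTER ends a statement */
        if (count >= max_answers) return -1;
        answers[count++] = acc;              /* answer is final right here */
        tok = next_token(&s);
    }
    return count;
}
```

One consequence of this design is that a statement can no longer span lines: the input 4 * 3 followed by + 5 on the next line is now rejected, because the second line starts with an operator.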