I've tried to write a basic syntax checker using bisonc++
The rules are:
expression -> OPEN_BRACKET expression CLOSE_BRACKET
expression -> expression operator expression
operator -> PLUS
operator -> MINUS
If I try to run the compiled code, I get an error at this line:
(a+b)-(c+d)
The first rule is applied, the leftmost and the rightmost brackets are the OPEN_BRACKET
and the CLOSE_BRACKET
. The remaining expression
is: a+b)-(c+d
How is it possible to prevent this behaviour? Is it possible to count the open and closed brackets?
Edit
The expression grammar:
expression:
OPEN_BRACKET expression CLOSE_BRACKET
{
//
}
| operator
{
//
}
| VARIABLE
{
//
}
;
operator:
expression PLUS expression
{
//
}
| expression MINUS expression
{
//
}
;
Edit2
The lexer
CHAR [a-z]
WS [ \t\n]
%%
{CHAR}+ return Parser::VARIABLE;
"+" return Parser::PLUS;
"-" return Parser::MINUS;
"(" return Parser::OPEN_BRACKET;
")" return Parser::CLOSE_BRACKET;
This is not a normal expression grammar. Try the normal one.
expression
: term
| expression '+' term
| expression '-' term
;
term
: factor
| term '*' factor
| term '/' factor
| term '%' factor
;
factor
: primary
| '-' factor // unary minus
| primary '^' factor // exponentiation, right-associative
;
primary
: identifier
| literal
| '(' expression ')'
;
Note also the above method of indenting and aligning, and that you only have to return yytext[0]
from the lexer for single special characters: you don't need special token names, and it's more readable without them:
CHAR [a-zA-Z]
DIGIT [0-9]
WHITESPACE [ \t\r\n]
%%
{CHAR}+ { return Parser::VARIABLE; }
{DIGIT}+ { return Parser::LITERAL; }
{WHITESPACE}+ ;
. { return yytext[0]; }