here is my code:
%{
#include<string.h>
#include "y.tab.h"
#define DEBUG 0
void yyerror(char* s);
void debug(char* string) {
if (DEBUG) {
printf(string);
}
}
%}
selector "selector"[0-9]+
positive "+"
negtive "-"
contains "."
before "->"
or "||"
and "&&"
delim [ /n/t]
ws {delim}*
%%
{selector} { debug("Lex:SELECTOR\n"); yylval.string = yytext; return SELECTOR; }
{positive} { debug("Lex:POSITIVE\n"); return POSITIVE; }
{negtive} { debug("Lex:NEGTIVE\n"); return NEGTIVE; }
{contains} { debug("Lex:BELONGS\n"); return CONTAINS; }
{before} { debug("Lex:BEFORE\n"); return BEFORE; }
{or} { debug("Lex:OR\n"); return OR; }
{and} { debug("Lex:AND\n"); return AND; }
{ws} ;
. { debug("Lex Parser Error"); exit(1); }
%%
.y:
%{
#include <stdio.h>
#define YYDEBUG 0
int yyparse(void);
void yyerror(char* s);
%}
%union {
char *string;
}
%token <string> SELECTOR
%token POSITIVE
%token NEGTIVE
%left CONTAINS
%left BEFORE
%token OR
%token AND
%%
logical_expr : assertion { printf("[reduce] L->A\n"); }
| logical_expr AND assertion { printf("[reduce] L->L && A\n");}
| logical_expr OR assertion { printf("[reduce] L->L || A\n"); }
;
assertion : POSITIVE prefix_relation { printf("[reduce] A->+P\n"); }
| NEGTIVE prefix_relation { printf("[reduce] A->-P\n"); }
;
prefix_relation : prefix_relation BEFORE contain_relation { printf("[reduce] P->P->C\n"); }
| contain_relation { printf("[reduce] P->C\n");;}
;
contain_relation : contain_relation CONTAINS SELECTOR { printf("[reduce] C->C.S[%s]\n", $3); }
| SELECTOR { printf("[reduce] C->S[%s]\n", $1); }
;
%%
int main()
{
return yyparse();
}
void yyerror(char* s)
{
fprintf(stderr,"%s",s);
}
int yywrap()
{
return 1;
}
my input string is: +selector1.selector2||-selector4->selector4
the parse tree of this input is expected to be:
my program generated by yacc gives output as below:
[reduce] C->S[selector1] // stack: +C
[reduce] C->C.S[selector2] // stack: +C
[reduce] P->C // stack: +P
[reduce] A->+P // stack: A
[reduce] L->A // stack: L
[reduce] C->S[selector3] // stack: L||-C
[reduce] P->C // stack: L||-P
[reduce] C->S[selector4] // stack: L||-P->C
it seems that the program stop doing shift&& reduce once cannot get any more symbol from yylex(), but i expect it to reduce the remained symbols in stack, thus L||-P->C
, so that i can generate the whole parse tree in my code.
my expected output is:
[reduce] C->S[selector1] // stack: +C
[reduce] C->C.S[selector2] // stack: +C
[reduce] P->C // stack: +P
[reduce] A->+P // stack: A
[reduce] L->A // stack: L
[reduce] C->S[selector3] // stack: L||-C
[reduce] P->C // stack: L||-P
[reduce] C->S[selector4] // stack: L||-P->C
[reduce] P->P->C // stack: L||-P
[reduce] A->-P // stack: L||A
[reduce] L->L||A // stack: L
There are a number of issues with your scanner (flex) definition.
Your default flex rule just calls exit
without any error message, (unless DEBUG
is defined and non-zero) so any lexical error will cause the program to silently halt. It would be much better to call yyerror
in that case, and produce a visible error message.
As EJP points out in a comment, your delim
definition uses /n
and /t
instead of \n
and \t
, so it will match neither a newline nor a tab. A newline won't be matched by your default rule either, so it will fall through to the flex-generated default rule, which simply prints the unmatched character to stdout
. (It is better to include %option nodefault
, which will cause flex to produce an error message if some input matches none of your rules.)
Your selector
rule sets yylval.string = yytext
. You cannot do that because yytext
points into the scanner's internal storage and the string it points to will be modified the next time yylex
is called. If you want to pass the matched token from the scanner to the parser, you must make a copy of the token, and then you need to ensure that you free
the storage allocated for it when you no longer require the string, in order to avoid leaking memory.
You need to be aware that parsers generated by bison
or yacc
generally need to read a lookahead token before performing reductions. Consequently, the last series of reductions you expect will not be performed until the scanner returns the END
token, which it will only do when it reads an end-of-file. So if you are testing your parser interactively, you will not see the final reductions until you signal an end-of-file by typing ctrl D (on Unix).
As a final note, both flex
and bison
are capable of generating debugging output which indicates which rules are matched (flex) and the series of shifts, reduces and other parser actions (bison). It's simpler and more reliable to use these features rather than trying to implement your own debugging output.