I am trying to find out where I went wrong in the code below.
Flex input:
%{
#include "jq.tab.h"
void yyerror(char *);
%}
method add|map|.. and other methods go here
%%
"/*" { return CS; }
"*/" { return CE; }
"jQuery" {
printf("%s is yytext\n", yytext);
return *yytext;
}
"args" { return ARGUMENT; }
{method} { return METHOD; }
[().\n] { return *yytext; }
[ \t]+ { return WS; }
. { return IGNORE; }
%%
int yywrap(void) {
return 1;
}
Bison input:
%{
#include <stdio.h>
int yylex(void);
void yyerror(char *);
%}
%token ARGUMENT METHOD IGNORE WS CS CE
%error-verbose
%%
stmts:
stmt '\n' { printf("A single stmt\n"); }
| stmt '\n' stmts { printf("Multi stmts\n"); }
;
stmt:
jQuerycall { printf("A complete call ends here\n"); }
| ignorechars { printf("Ignoring\n"); }
| ignorechars WS jQuerycall { printf("ignore+js\n"); }
| jQuerycall WS ignorechars { printf("js+ignore\n"); }
| optionalws stmt optionalws
| CS stmt CE { printf("comment\n"); }
;
jQuerycall:
'jQuery' '(' ARGUMENT ')' '.' methodchain { printf("args n methodchain\n"); }
| 'jQuery' '(' ')' '.' methodchain { printf("methodchain\n"); }
| 'jQuery' '(' ARGUMENT ')' { printf("args\n"); }
| 'jQuery' '(' ')' { printf("empty call\n"); }
;
methodchain:
methodchain '.' methodcall
| methodcall
;
methodcall:
METHOD '(' ')'
;
ignorechars:
IGNORE
| IGNORE optionalws ignorechars
;
optionalws:
| WS
;
%%
void yyerror(char *s) {
fprintf(stderr, "%s\n", s);
}
int main(void) {
yyparse();
return 0;
}
My aim is to recognize any jQuery call with all its elements and to ignore every other statement or string, as well as comments. Right now, this code makes many assumptions, such as 'args' being the only selector element allowed inside jQuery().
I am using the following input-output cases. Cases like 10 and 12 are the ones I'm trying to figure out:
> 1.input: statement\n output: Ignoring
>
> 2.input: statement statement\n output: Ignoring
>
> 3.input: statement statement statement\n output: Ignoring
>
> 4.input: jQuery()\n output: jQuery is yytext empty call A complete call ends here
>
> 5.input: jQuery(args)\n output: jQuery is yytext args A complete call ends here
>
> 6.input: jQuery().add()\n output: jQuery is yytext methodchain A complete call ends here
>
> 7.input: jQuery(args).add().map()\n output: jQuery is yytext args n methodchain A complete call ends here
>
> 8.input: /*comment*/\n output: Ignoring comment
>
> 9.input: /*jQuery()*/\n output: jQuery is yytext empty call A complete call ends here comment
>
> 10.input: /* comment */\n output: syntax error, unexpected CE, expecting IGNORE
>
> 11.input: var a = b\n output: Ignoring
>
> 12.input: var a = jQuery(args)\n output: jQuery is yytext syntax error, unexpected 'jQuery', expecting IGNORE
In your lex file, the rule:
"jQuery" {
printf("%s is yytext\n", yytext);
return *yytext;
}
returns the token `'j'` (the first character of `yytext`) when it sees the input string `jQuery`. Since your bison file never does anything with the token `'j'`, this will generally give you a syntax error.
You need to add `JQUERY` to your `%token` declaration and have this lex rule return that. The grammar then needs to use `JQUERY` wherever it currently writes `'jQuery'`.
Edit:
Usually comments can appear anywhere in a program (between any two other tokens) and are completely ignored. So the easiest way to deal with them is in the lexer:
%x comment
%%
"/*" { BEGIN(comment); }
<comment>.|\n ;
<comment>"*/" { BEGIN(INITIAL); }
This will skip over comments (returning no tokens at all), so the grammar doesn't need to worry about them. If you don't want to use a lexer start state, you could instead use a single, more complex regex:
"/*"([^*]|\*+[^*/])*\*+"/" ;