I have a lex file with my rules, such as:
PROGRAM return Parser::PROGRAM;
PROGRAM_END return Parser::PROGRAM_END;
VARIABLES: return Parser::VARIABLES;
INSTRUCTIONS: return Parser::INSTRUCTIONS;
SKIP return Parser::SKIP;
. {
    std::cerr << lineno() << ": ERROR." << std::endl;
    exit(1);
}
When I run the fully compiled version (built together with the yacc file and so on) on a test file, only this last rule is used, even if the test file is correct.
For example this is a test file for these rules:
PROGRAM fst
INSTRUCTIONS:
SKIP
PROGRAM_END
For this file, the only output I get is: 1: ERROR.
Why is this, and how can I resolve this?
As indicated in the comments, it is almost certainly the case that PROGRAM is being recognised as a token and passed to the parser. In almost all cases, however, the parser will immediately request another token, and the next character in the input sequence is a space, which is matched by the last rule. That rule prints an error message and calls exit(), terminating the application. (That's not generally a good idea, but I suppose this is just a test program.) So that's all the output you'll get.
If you specify the -d command-line argument when you invoke (f)lex, a debugging scanner will be generated which reports the progress of the scanner as it works. That's a very easy way to see what is going on in your scanner. Bison also has a debugging mode, as explained in the bison manual. These tools are very simple to use, and come highly recommended.
Here, for example, is a quick test rig:
%{
#include <iostream>
#include <cstdlib>
class Parser {
  public:
    enum Token {
        PROGRAM = 257,
        PROGRAM_END, VARIABLES, INSTRUCTIONS, SKIP
    };
};
%}
%option batch noyywrap yylineno c++
%%
PROGRAM return Parser::PROGRAM;
PROGRAM_END return Parser::PROGRAM_END;
VARIABLES: return Parser::VARIABLES;
INSTRUCTIONS: return Parser::INSTRUCTIONS;
SKIP return Parser::SKIP;
. {
    std::cerr << lineno() << ": ERROR." << std::endl;
    exit(1);
}
%%
int main() {
    yyFlexLexer lexer{};
    lexer.set_debug(1);
    while(lexer.yylex() != 0) { }
    return 0;
}
And a sample run:
$ g++ lex.yy.cc && ./a.out<<<"PROGRAM fst"
--(end of buffer or a NUL)
--accepting rule at line 14("PROGRAM")
--accepting rule at line 19(" ")
1: ERROR.
which makes it clear that the scanner did first produce the PROGRAM token, before exiting on the space character.
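As for resolving it, the usual approach is to add rules that silently consume whitespace, placed before the catch-all rule. The following is only a sketch and rests on assumptions: it supposes that whitespace is not significant in your language, and that an identifier rule for names such as fst already exists elsewhere in your file (it is not shown in the question). Note that . never matches a newline in flex, so newlines need a rule of their own.

```lex
SKIP            return Parser::SKIP;
[ \t\r]+        { /* assumption: spaces and tabs only separate tokens, so discard them */ }
\n              { /* discard newlines too; %option yylineno still counts them */ }
.               {
                    std::cerr << lineno() << ": ERROR." << std::endl;
                    exit(1);
                }
```

Because these actions do not return, the scanner simply moves on to the next token, and the parser never sees the whitespace.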