I am trying to get an example from the Flex manual [1] working. The example shows Flex rules for a quoted string that may contain octal escape codes.
The manual is a bit incomplete in its description of the action for the closing quote. It simply has this comment:
/* return string constant token type and
* value to parser
*/
So I created code that I thought would work, but apparently my code is incorrect.
Below is the lexer followed by the parser. When I execute the generated parser, I get this output:
The string is: ''
What I expect, and want, is this output:
The string is: 'John Doe'
My input is this: "John Doe"
What am I doing wrong, please?
Here is the lexer:
%option noyywrap
%x STR
%{
#include "parse.tab.h"
#define MAX_STR_CONST 100
%}
%%
char string_buf[MAX_STR_CONST];
char *string_buf_ptr;
\" { string_buf_ptr = string_buf; BEGIN(STR); }
<STR>{
\" { /* closing quote - all done */
BEGIN(INITIAL);
*string_buf_ptr = '\0';
yylval.strval = strdup(string_buf_ptr);
return(STRING);
}
\n { /* error - unterminated string constant */
perror("Error - unterminated string");
yyterminate();
}
\\[0-7]{1,3} { /* octal escape sequence */
int result;
(void) sscanf(yytext+1, "%o", &result);
if (result > 0xff) {
perror("Error - octal escape is out-of-bounds");
yyterminate();
}
*string_buf_ptr++ = result;
}
\\[0-9]+ { /* bad escape sequence */
perror("Error - bad escape sequence");
yyterminate();
}
\\n *string_buf_ptr++ = '\n';
\\t *string_buf_ptr++ = '\t';
\\r *string_buf_ptr++ = '\r';
\\b *string_buf_ptr++ = '\b';
\\f *string_buf_ptr++ = '\f';
\\(.|\n) *string_buf_ptr++ = yytext[1];
[^\\\n\"]+ {
char *yptr = yytext;
while (*yptr)
*string_buf_ptr++ = *yptr++;
}
}
%%
Here is the parser:
%{
#include <stdio.h>
#include <stdlib.h>
/* interface to the lexer */
extern int yylineno; /* from lexer */
int yylex(void);
void yyerror(const char *s, ...);
extern FILE *yyin;
int yyparse (void);
%}
%union {
char *strval;
}
%token <strval> STRING
%%
start
: STRING { printf("The string is: '%s'", $1);}
;
%%
int main(int argc, char *argv[])
{
yyin = fopen(argv[1], "r");
yyparse();
fclose(yyin);
return 0;
}
void yyerror(const char *s, ...)
{
fprintf(stderr, "%d: %s\n", yylineno, s);
}
[1] See pages 24-25 in the Flex manual https://epaperpress.com/lexandyacc/download/flex.pdf
Your action is:
*string_buf_ptr = '\0';
yylval.strval = strdup(string_buf_ptr);
return STRING;
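It seems pretty clear that strdup of string_buf_ptr will return a newly-allocated copy of an empty string, since you just set the character pointed to by string_buf_ptr to 0. At that point string_buf_ptr points at the end of the accumulated text, not the beginning; the text starts at string_buf, so that is what you need to pass to strdup.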
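To see the difference outside of flex, here is a minimal C sketch. The helper names (append_text, close_from_write_ptr, close_from_start) are hypothetical and exist only for this illustration; they stand in for the <STR> rules and the closing-quote action.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() under strict ISO C modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STR_CONST 100

/* Hypothetical stand-in for the <STR> rules: append text at the write
 * pointer and return the advanced pointer, as string_buf_ptr would be. */
static char *append_text(char *write_ptr, const char *buf_start, const char *text)
{
    size_t len = strlen(text);
    /* Leave room for the terminating '\0' written by the closing action. */
    assert((size_t)(write_ptr - buf_start) + len < MAX_STR_CONST);
    memcpy(write_ptr, text, len);
    return write_ptr + len;
}

/* What the question's closing-quote action does: terminate, then copy
 * starting at the terminator. The copy is always an empty string. */
static char *close_from_write_ptr(char *write_ptr)
{
    *write_ptr = '\0';
    return strdup(write_ptr);
}

/* Corrected action: terminate at the write pointer, copy from the start. */
static char *close_from_start(char *buf_start, char *write_ptr)
{
    *write_ptr = '\0';
    return strdup(buf_start);
}
```

In the lexer, the equivalent change would be yylval.strval = strdup(string_buf); in the closing-quote action, leaving the rest of the action as it is.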
Two comments:
Also, perror is intended to present the user with an error message based on the value of errno. That's not very useful in this context; you probably want to call yyerror. (However, you'll need to declare it in the lexer, unless you arrange for its prototype to be inserted in parse.tab.h. See %code requires/%code provides in the bison manual for how to do that.)
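As a rough sketch of the %code provides approach, assuming a bison version that supports %code (2.4 or later, as far as I know) and a header generated with bison -d parse.y, the declaration could be added to the first section of the parser like this:

```yacc
/* parse.y, declarations section (sketch; placement is an assumption) */
%code provides {
/* Copied into parse.tab.h, so the lexer picks it up through
 * its existing #include "parse.tab.h". */
void yyerror(const char *s, ...);
}
```

With that in place, a lexer action such as the unterminated-string rule could call yyerror("unterminated string constant"); followed by yyterminate(); instead of perror.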