Flex action for quoted string returns empty string

I am trying to get an example from the Flex manual [1] working. The example shows Flex rules for a quoted string that may contain octal escape codes.

The manual is a bit incomplete in its description of the action for the closing quote. It simply has this comment:

/* return string constant token type and
 * value to parser
 */

So I created code that I thought would work, but apparently my code is incorrect.

Below is the lexer followed by the parser. When I execute the generated parser, I get this output:

The string is: ''

What I expect, and want, is this output:

The string is: 'John Doe'

My input is this: "John Doe"

What am I doing wrong, please?

Here is the lexer:

%option noyywrap
%x STR
%{
#include ""
#define MAX_STR_CONST 100
%}
%%
    char string_buf[MAX_STR_CONST];
    char *string_buf_ptr;
\"            { string_buf_ptr = string_buf; BEGIN(STR); }
<STR>{
    \"          { /* closing quote - all done */
                   *string_buf_ptr = '\0';
                   yylval.strval = strdup(string_buf_ptr);
                   return STRING;
                }
    \n          {  /* error - unterminated string constant */
                   perror("Error - unterminated string");
                }
    \\[0-7]{1,3} { /* octal escape sequence */
                   int result;
                   (void) sscanf(yytext+1, "%o", &result);
                   if (result > 0xff) {
                      perror("Error - octal escape is out-of-bounds");
                   }
                   *string_buf_ptr++ = result;
                }
    \\[0-9]+    { /* bad escape sequence */
                   perror("Error - bad escape sequence");
                }
    \\n         *string_buf_ptr++ = '\n';
    \\t         *string_buf_ptr++ = '\t';
    \\r         *string_buf_ptr++ = '\r';
    \\b         *string_buf_ptr++ = '\b';
    \\f         *string_buf_ptr++ = '\f';
    \\(.|\n)    *string_buf_ptr++ = yytext[1];
    [^\\\n\"]+  {
                   char *yptr = yytext;
                   while (*yptr)
                      *string_buf_ptr++ = *yptr++;
                }
}

Here is the parser:

%{
#include <stdio.h>
#include <stdlib.h>
/* interface to the lexer */
extern int yylineno; /* from lexer */
int yylex(void);
void yyerror(const char *s, ...);
extern FILE *yyin;
int yyparse (void);
%}
%union {
   char *strval;
}
%token <strval> STRING
%%
    : STRING       { printf("The string is: '%s'", $1);}
%%

int main(int argc, char *argv[])
{
    yyin = fopen(argv[1], "r");
    yyparse();
    return 0;
}

void yyerror(const char *s, ...)
{
  fprintf(stderr, "%d: %s\n", yylineno, s);
}

[1] See pages 24-25 in the Flex manual


  • Your action is:

    *string_buf_ptr = '\0';
    yylval.strval = strdup(string_buf_ptr);
    return STRING;

    It seems pretty clear that strdup of string_buf_ptr will return a newly allocated copy of an empty string, since you just set the character pointed to by string_buf_ptr to 0. The collected text starts at string_buf, so that is the pointer to pass to strdup.
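
To make the pointer mix-up concrete, here is a minimal C sketch that reproduces it outside of Flex. Only string_buf, string_buf_ptr and MAX_STR_CONST come from the question; the helpers collect, finish_buggy and finish_fixed are hypothetical names invented for this illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STR_CONST 100

static char string_buf[MAX_STR_CONST];
static char *string_buf_ptr;  /* points one past the last stored character */

/* Hypothetical helper: stands in for the rules that append text
 * to the buffer while the lexer is inside the quoted string. */
static void collect(const char *text)
{
    string_buf_ptr = string_buf;
    while (*text) {
        /* leave room for the terminating NUL */
        assert(string_buf_ptr < string_buf + MAX_STR_CONST - 1);
        *string_buf_ptr++ = *text++;
    }
    *string_buf_ptr = '\0';  /* what the closing-quote action does first */
}

/* What the question's action does: duplicates from the END of the text,
 * where the terminator was just written. */
static char *finish_buggy(void)
{
    return strdup(string_buf_ptr);
}

/* Duplicates from the START of the buffer, which holds the whole string. */
static char *finish_fixed(void)
{
    return strdup(string_buf);
}
```

After collect("John Doe"), finish_buggy() returns an empty string and finish_fixed() returns a copy of John Doe. In the lexer action, the corresponding change would presumably be to pass string_buf instead of string_buf_ptr to strdup.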

    Two comments:

    • This bug has essentially nothing to do with Flex (or Bison). I know that it is always tempting to assume that the most unfamiliar technology you are using is the source of errors, but making assumptions like that is not a very effective debugging technique.
    • A debugger is often a faster way of finding bugs than Stack Overflow. There's a bit of a learning curve to using gdb, but it will definitely pay off in the end (perhaps even soon).

    Also, perror is intended to present the user with an error message based on the value of errno. That's not very useful in this context; you probably want to call yyerror. (However, you'll need to declare it in the lexer, unless you arrange for its prototype to be inserted in the header file that bison generates. See %code requires/%code provides in the bison manual for how to do that.)
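
The difference between the two reporting styles can be sketched in plain C. The helpers perror_style and yyerror_style below are hypothetical, written only for this illustration: the first builds the text that perror(msg) would write to stderr (the message, a colon and space, then the description of the current errno value), and the second builds a report in the format used by the yyerror in the question's parser.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: the text perror(msg) would print.
 * Note the errno description appended after the message. */
static void perror_style(char *out, size_t n, const char *msg)
{
    assert(out != NULL && n > 0);
    snprintf(out, n, "%s: %s\n", msg, strerror(errno));
}

/* Hypothetical helper: a report in the question's yyerror format,
 * a line number followed by the message, with no errno text. */
static void yyerror_style(char *out, size_t n, int lineno, const char *msg)
{
    assert(out != NULL && n > 0);
    snprintf(out, n, "%d: %s\n", lineno, msg);
}
```

Since nothing in the lexer sets errno, the text perror appends after the colon has no connection to the scanning error, whereas the yyerror format reports where the problem occurred.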