Tags: parsing, grammar, yacc, bison, bnf

Yacc parser grammar bug. int X; read separately as int X and ;


Problem Description

In my yacc parser grammar, I have the following rules and corresponding actions defined (see program.y below). Parsing int X; should have the derivation type => TOK_INT and variable_list => TOK_VARIABLE, and these should then match a declaration, which is followed by the ; that ends the statement. However, the parser reads this as int X and ;, that is, as two separate statements. Can anyone see why?

program.y

program:
    function { exit(0); }
    ;

function:
    function line { printf("goal\n"); printtree_print($2); }
    | /* empty */
    ;

line:
    statement ';' { printf("line\n"); printtree_print($1); }
    ;

statement:
    declaration { printf("declaration\n"); printtree_print($1); }
    | assignment { printf("assignment\n"); printtree_print($1); }
    ;

declaration:
    type variable_list { printf("varlist\n"); printtree_print($2); $$ = $2; }
    ;

type:
    TOK_INT { typeMode = typeInt; }
    ;

variable_list:
    TOK_VARIABLE
    {
        $$ = node_mkVariable($1, typeMode);
        printtree_print($$);
    }
    ;

assignment:
    TOK_VARIABLE TOK_ASSIGN expr
    {
        printf("assignment %s = expr\n", $1);
        node_setInTable($1, $3);
        $$ = node_getFromTable($1);
    }
    ;

expr:
    TOK_INTEGER { $$ = node_mkConstant($1); }
    | TOK_VARIABLE { $$ = node_mkVariable($1, typeVariable); }
    ;

Solution

  • Since 'expr' and 'assignment' are probably not germane to the problem, I omitted them from my test rig. You didn't provide minimal compilable code that demonstrates the problem, so I created it for you:

    %{
    #include <stdlib.h>
    #include <stdio.h>
    static void yyerror(const char *str);
    static int yylex(void);
    static void printtree_print(int);
    static int node_mkVariable(int, int);
    int typeMode;
    enum { typeInt };
    %}
    %token TOK_INT
    %token TOK_VARIABLE
    %%
    program:
        function
            { exit(0); }
        ;
    
    function:
            /* Nothing */
        |   function line
            { printf("goal\n"); printtree_print($2); }
        ;
    
    line:
        statement ';'
            { printf("line\n"); printtree_print($1); }
        ;
    
    statement:
        declaration
            { printf("declaration\n"); printtree_print($1); }
        ;
    
    declaration: 
        type variable_list
            { printf("varlist\n"); printtree_print($2); $$ = $2;  }
        ;
    
    type:
        TOK_INT
             { typeMode = typeInt; }
        ;
    
    variable_list: 
        TOK_VARIABLE
        {
            $$ = node_mkVariable($1, typeMode); 
            printtree_print($$);
        }
        ; 
    %%
    static void printtree_print(int n)
    {
        printf("PT_P: %d\n", n);
    }
    static int yylex(void)
    {
        static int counter = 0;
        static int tokens[] = { TOK_INT, TOK_VARIABLE, ';', 0 };
        enum { NUM_TOKENS = sizeof(tokens) / sizeof(tokens[0]) };
        if (counter < NUM_TOKENS)
        {
            printf("Token: %d\n", tokens[counter]);
            return(tokens[counter++]);
        }
        return 0;
    }
    static int node_mkVariable(int var, int mode)
    {
        return 23 + var + mode;
    }
    static void yyerror(const char *str)
    {
        fprintf(stderr, "Error: %s\n", str);
        exit(1);
    }
    int main(void)
    {
        while (yyparse() == 0)
            ;
        return 0;
    }
    

    When I compile and run it, I get this output:

    Token: 258
    Token: 259
    PT_P: 23
    varlist
    PT_P: 23
    declaration
    PT_P: 23
    Token: 59
    line
    PT_P: 23
    goal
    PT_P: 23
    Token: 0
    

    This looks correct given the infrastructure, and shows no sign of the behaviour you observed. So you need to show us just enough extra code to reproduce your problem, demonstrating that it is a feature of your grammar rather than an artefact of the code you didn't supply.

    FWIW: this was compiled on Mac OS X 10.6.7 using the system-provided Yacc (actually Bison 2.3); I got essentially the same output with 2 other variants of Yacc on my machine. The compiler was GCC 4.2.1 (Xcode 3).
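
The rig above bypasses the lexer: its yylex() hands the parser TOK_INT, TOK_VARIABLE, ';' and 0 directly. One thing worth ruling out in the full program is therefore the token stream the real lexer produces for int X;. What follows is only a sketch of a hand-written lexer, not the asker's code: lex_input and lex_set_input() are hypothetical names, and the token values 258 and 259 are copied from the output above (a real build would take them from the header that yacc -d generates).

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Token codes copied from the test output above; normally these come
   from the generated header rather than being written by hand. */
enum { TOK_INT = 258, TOK_VARIABLE = 259 };

/* Hypothetical input cursor; a real scanner would usually read a file. */
static const char *lex_input = "";

/* Hypothetical helper: point the sketch lexer at a string. */
void lex_set_input(const char *text)
{
    assert(text != NULL);
    lex_input = text;
}

/* Sketch of yylex: skips white space, returns TOK_INT for the keyword
   "int", TOK_VARIABLE for any other identifier, the character itself
   for punctuation such as ';', and 0 at end of input. */
int yylex(void)
{
    while (isspace((unsigned char)*lex_input))
        lex_input++;

    if (*lex_input == '\0')
        return 0;

    if (isalpha((unsigned char)*lex_input) || *lex_input == '_')
    {
        const char *start = lex_input;
        while (isalnum((unsigned char)*lex_input) || *lex_input == '_')
            lex_input++;
        size_t len = (size_t)(lex_input - start);
        if (len == 3 && strncmp(start, "int", 3) == 0)
            return TOK_INT;
        return TOK_VARIABLE;
    }

    /* Single-character token: return its character code and advance. */
    return (unsigned char)*lex_input++;
}
```

Printing each value returned by such a yylex(), as the rig does with its Token: lines, shows whether the parser really receives three tokens followed by 0 for int X;.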