c parsing bison yacc parser-generator

yacc loses values between reductions


I'm working on a grammar to build an SDD for type checking, or something similar. I spent yesterday working out the data structures and parsing actions, but I kept hitting a segmentation fault. It seems to me that yacc (bison) is losing values between reductions.

So I decided to build a simpler grammar with simpler actions. It still seems that values are lost between one reduction and the next, or am I doing something wrong? The lexer is not relevant to this example, so I omitted it.

Below is the grammar with its actions, followed by the actual output and the expected output.

D:   T VAR SEMICOLON D              {
                                    printf("processing D -> T var ; D\n");
                                    printf("\tvalue of T is %f\n", $1);
                                }
|/*empty*/                      {
                                    printf("processing D -> empty\n");
                                }
;

T:  B                               {
                                    printf("processing B inside T\n");
                                    printf("\tvalue of B is %f\n", $1);
                                } 

C                               {   printf("processing C inside T\n");
                                    printf("processing T-> B C\n");
                                    printf("\tvalue of B is %f\n", $1);
                                    printf("\tvalue of C is %f\n", $<dbl>2);
                                    $$ = $1 + $<dbl>2;

                                }
| RECORD '{' D '}'              {   printf("processing record { D }\n");}
;

B:   INT                            {   printf("processing B -> int\n");
                                    $$ = 1;
                                }
| FLOAT                         {   printf("processing B -> float\n");
                                    $$ = 1;
                                }
;

C:  /*empty*/                       {   printf("processing C -> empty\n");
                                    printf("\tsetting C to be equal to 1\n");
                                    $$=1;
                                }
| LBRACK NUM RBRACK C           {   int n = $2;
                                    printf("processing C -> [%d] C\n", n);
                                    double d = $4;
                                    printf("\tprevious C value is %f\n", d);
                                    double f = d+ 1;
                                    printf("\tnew value of $$ is %f\n", f);
                                    $$ = f;
                                }
;

This is the output for an input like int [12][3] ciao;

processing B -> int
processing B inside T
    value of B is 1.000000
processing C -> empty
    setting C to be equal to 1
processing C -> [3] C
    previous C value is 1.000000
    new value of $$ is 2.000000           
processing C -> [12] C                    
    previous C value is 2.000000          
    new value of $$ is 3.000000           
processing C inside T
processing T-> B C
    value of B is 1.000000
    value of C is 0.000000                (*)
processing D -> empty
processing D -> T var ; D
    value of T is 1.000000                (*)

As you can see, the values are lost after the C reductions, on the lines marked with (*). I expected the value to accumulate, as in the following:

processing B -> int
processing B inside T
    value of B is 1.000000
processing C -> empty
    setting C to be equal to 1
processing C -> [3] C
    previous C value is 1.000000
    new value of $$ is 2.000000           
processing C -> [12] C                    
    previous C value is 2.000000          
    new value of $$ is 3.000000           
processing C inside T
processing T-> B C
    value of B is 1.000000
    value of C is 3.000000                
processing D -> empty
processing D -> T var ; D
    value of T is 4.000000                

Any hint is appreciated, as well as an explanation or a suggestion for reaching my goal. Is there anything I am missing?


Solution

  • The production for T, simplified considerably, is as follows:

    T: B { /* Mid Rule Action (MRA) */ } C { $$ = $1 + $2; }
    

    In the final action for T, $2 refers to the MRA, because MRAs are counted among the production's components. (In fact, an MRA is replaced with a generated non-terminal that has an empty RHS.) So C is $3.

    Since the MRA does not actually set a value, the value of $2 is unspecified, but 0 is a plausible result, and it matches the output lines marked with (*).

    Bison manual references:

    Using Mid-Rule Actions:

    The mid-rule action itself counts as one of the components of the rule. This makes a difference when there is another action later in the same rule (and usually there is another at the end): you have to count the actions along with the symbols when working out which number n to use in $n.

    Mid-Rule Action Translation: points out that "mid-rule actions are actually transformed into regular rules and actions", and then provides a number of examples of the empty rules produced (and their internal names, useful for understanding bison debugging output.)
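
    Putting the above together, a minimal sketch of the corrected T rule could look like the following. The %union and %type declarations are assumptions: the question does not show them, and they are inferred only from its use of $<dbl>2. The second alternative of T is omitted for brevity.

    ```yacc
    /* Assumed declarations, not shown in the question
       (inferred from the $<dbl>2 in the original action). */
    %union { double dbl; }
    %type <dbl> T B C

    %%

    T:  B   {   /* Mid-rule action: this whole block is component $2. */
                printf("processing B inside T\n");
                printf("\tvalue of B is %f\n", $1);
            }
        C   {   /* B is $1, the mid-rule action is $2, C is $3. */
                printf("processing T-> B C\n");
                printf("\tvalue of B is %f\n", $1);
                printf("\tvalue of C is %f\n", $3);
                $$ = $1 + $3;
            }
    ;
    ```

    Because C has a declared type here, a plain $3 needs no explicit <dbl> tag. The tag was only needed on $2 because the mid-rule action has no declared type, which is itself a hint that $2 was not referring to C. Bison (though not POSIX yacc) also supports named references, which avoid counting positions altogether.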