Tags: c, lemon

Parser reducing too early


I've got a grammar that basically looks like this:

start ::= groups.
groups ::= groups group.
groups ::= group.
group(A) ::= IDENTIFIER identparams CURLY_OPEN assignments CURLY_CLOSE SEMICOLON.
group(A) ::= IDENTIFIER CURLY_OPEN assignments CURLY_CLOSE SEMICOLON.
assignments ::= assignments assignment.
assignment ::= IDENTIFIER ASSIGNMENT bool_expr SEMICOLON.

It parses something like:

name {
     name = "value";
     name2 = "value";
};

Yes, this is a named-like config. What happens is this:

name = "value" results in assignments ::= assignments assignment. being reduced. I would expect the value of assignments to stay constant across reductions, but that isn't the case:

P assignment(0x807e778) ::= IDENTIFIER(0x807e728) ASSIGNMENT mvalue SEMICOLON.

P assignments((nil)) ::= assignments((nil)) assignment(0x807e778).
P append 0x807e778 to 0x807e838
P mvalue ::= string.
P assignment(0x807e750) ::= IDENTIFIER(0x807e7c8) ASSIGNMENT mvalue SEMICOLON.
P assignments((nil)) ::= assignments(0x807e838) assignment(0x807e750).
P append 0x807e750 to 0x807e910
P mvalue ::= string.
P assignment(0x807e7f0) ::= IDENTIFIER(0x807e7a0) ASSIGNMENT mvalue SEMICOLON.
P assignments((nil)) ::= assignments(0x807e910) assignment(0x807e7f0).
P append 0x807e7f0 to 0x807e9e8
P group(0x807e7a0) assignments(0x807e9e8) : bind
P groups ::= group(0x807e7a0).

The lemon debug output is http://pastebin.com/yHNkNRpf

As a result, only name2 is added to the list. I'm puzzled by this. I understand the reduce, but not why assignments keeps being set to null. There are ways around this, but I would prefer a proper fix.

Any clues?


Solution

  • Better late than never: the re2c lexer I used didn't have all token paths covered. In other words, some input didn't match any rule, and no default rule was set. In that case the behaviour is undefined. Here, the lexer returned the wrong token.
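
To illustrate the fix, here is a minimal hand-written C sketch of a lexer in which every input path ends in an explicit token, including a catch-all error token. It is not the original re2c source: the token names (TOK_IDENTIFIER, TOK_ERROR, ...) and the function next_token are hypothetical and only loosely mirror the grammar in the question. In re2c itself the equivalent would, as far as I know, be a default `*` rule that returns an error token.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token codes; a real Lemon parser takes these from its generated header. */
enum token {
    TOK_EOF = 0,
    TOK_IDENTIFIER,
    TOK_ASSIGNMENT,
    TOK_STRING,
    TOK_CURLY_OPEN,
    TOK_CURLY_CLOSE,
    TOK_SEMICOLON,
    TOK_ERROR
};

/* Scan one token starting at *cursor and advance *cursor past it. */
static enum token next_token(const char **cursor)
{
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;
    enum token tok;

    /* Skip leading whitespace. */
    while (isspace((unsigned char)*p)) p++;

    if (*p == '\0') {
        tok = TOK_EOF;                      /* stay on the terminator */
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_') p++;
        tok = TOK_IDENTIFIER;
    } else if (*p == '"') {
        p++;
        while (*p != '\0' && *p != '"') p++;
        if (*p == '"') {
            p++;
            tok = TOK_STRING;
        } else {
            tok = TOK_ERROR;                /* unterminated string */
        }
    } else {
        switch (*p) {
        case '=': tok = TOK_ASSIGNMENT;  break;
        case '{': tok = TOK_CURLY_OPEN;  break;
        case '}': tok = TOK_CURLY_CLOSE; break;
        case ';': tok = TOK_SEMICOLON;   break;
        default:  tok = TOK_ERROR;       break; /* catch-all: never fall through */
        }
        p++;                                /* consume the single character */
    }

    *cursor = p;
    return tok;
}
```

With a catch-all like this, unexpected input surfaces as TOK_ERROR at the point where it occurs, instead of reaching the parser as some other token and causing confusing reductions later on.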