I am using Treetop to create a grammar for the Verilog language, and have come across some cases where the language specification involves a left recursive construct which does not translate to Treetop.
I have done some reading on this, and this answer gives a good summary of a generic way to eliminate left recursion: Left recursion elimination
However, I cannot wrap my head around how that actually works and would appreciate if someone more knowledgeable could confirm whether my approach here is correct...
For this original rule, which includes left recursion (the comment is how it is written in the language spec):
# constant_expression ::=
# constant_primary
# | unary_operator { attribute_instance } constant_primary
# | constant_expression binary_operator { attribute_instance } constant_expression
# | constant_expression ? { attribute_instance } constant_expression : constant_expression
rule constant_expression
constant_primary /
(unary_operator (s attribute_instance)* s constant_primary) /
(constant_expression s binary_operator (s attribute_instance)* s constant_expression) /
(constant_expression s "?" (s attribute_instance)* s constant_expression s ":" s constant_expression)
end
Is the following really equivalent with the left recursion removed?
rule constant_expression
(constant_primary constant_expression_tail?) /
(unary_operator (s attribute_instance)* s constant_primary constant_expression_tail?)
end
rule constant_expression_tail
(s binary_operator (s attribute_instance)* s constant_expression constant_expression_tail?) /
(s "?" (s attribute_instance)* s constant_expression s ":" s constant_expression constant_expression_tail?)
end
That seems to make sense and do the same thing. One thing that might help in understanding is to rewrite like the code below. One thing to remember about PEG grammars is that if a rule fails to match, it will try to match the next alternation.
rule constant_expression
(constant_primary / (unary_operator (s attribute_instance)* s constant_primary)) constant_expression_tail?
end
rule constant_expression_tail
((s binary_operator (s attribute_instance)* s) /
(s "?" (s attribute_instance)* s constant_expression s ":" s)) constant_expression constant_expression_tail?
end