I start with having this grammar
E -> E + T | T
T -> id | id[] | id[X]
X -> E, E | E
It's obvious that there's left recursion in the first rule, so I remove it to get the following
E -> TE'
E' -> +TE' | epsilon
T -> id | id[] | id[X]
X -> E, E | E
From here I produce the FIRST
and FOLLOW
sets (id
is a single terminal).
FIRST(TE') = FIRST(E, E) = FIRST(E) = {id}
FIRST(+TE') = {+}
FIRST(epsilon = {epsilon}
FIRST(id) = FIRST(id[]) = FIRST(id[X]) = {id}
FOLLOW(E) = {$}
FOLLOW(E') = {$}
FOLLOW(T) = {$}
FOLLOW(X) = {]}
But when I try to construct the parse table, I end up with multiple values for a couple of cells.
| id | + | [ | ] | , | $ |
----+-------------------+--------+----------+-----+-----+
E | TE' | | | | | |
E' | | +TE' | | | | ε |
T |id or id[] or id[X]| | | | | |
X |E, E or E | | | | | |
Am I missing something? How do rectify this so that I can parse id + id[id + id, id[]]
using this grammar?
Your grammar, even after removing left recursion, is not LL(1). These productions are problematic:
T -> id | id[] | id[X]
X -> E , E | E
Try to apply the following transformation:
A -> ab_1 | ... | ab_n ~~~~~~> A -> aA'
A' -> b_1 | ... | b_n
In your case it is:
T -> id T'
T' -> [] | [X] | epsilon
X -> EX'
X' -> , E | epsilon
But T'
still needs to apply the transformation one more time:
T -> id T'
T' -> [ T" | epsilon
T" -> ] | X ]
Full grammar:
E -> TE'
E' -> +TE' | epsilon
T -> id T'
T' -> [ T" | epsilon
T" -> ] | X ]
X -> EX'
X' -> , E | epsilon
Moreover, you should do the steps in the following way:
1) Compute FIRST for all nonterminals.
2) Compute FOLLOW for all nonterminals.
3) Compute SELECT for all productions. Keep in mind that SELECT sets for one nonterminal should be disjoint.