So i'm studying compiler construction for an exam , but there seems something i can't understand.
Let's assume the following set of terminals:
T = {:, *, =, (, ), <, >, {, }, [a..z], [0..9]}
And the set of accepted tokens (l
points to a letter , and d
points to a digit):
A = { l(l|d)* , (d)+ , { (anything)* } , : , := , < , <= , <> , > , >= , ( , (* , * , *) }
Here's the state transition diagram:
Now i understand that state number 20 needs backtracking because:
(
might come , but it might be followed by a *
.:
might come , but it might be followed by a =
.<
might come , but it might be followed by a =
or >
.>
might come , but it might be followed by a =
.But what about states 3 , 5 , and 11 ? , why do we need to backtrack them ?.
It looks like "backtracking" here means you've read a character, but you haven't consumed it. So for state 3 we've consumed a l
and maybe some l
and d
, and then we get some character that is neither; that character terminates the l(l|d)*
rule, but then still needs to be processed.
Contrast that with state 7 where we read a }
character to terminate the rule, and we've used up the }
so we don't need to "backtrack."
This explanation is consistent with states 3, 5, and 20, but I don't get why 11 needs to backtrack.