I want to find FIRST and FOLLOW for the following CFG:
S -> O | M
M -> iEtMeM | a
O -> iEtS | iEtMeO
E -> b
I did the left factorization, so I get:
S -> O | M
M -> iEtMeM | a
O -> iEtO'
O'-> S | MeO
E -> b
The FIRST sets:
FIRST(S) = FIRST(O)|FIRST(M) = {i,a}
FIRST(M) = {i,a}
FIRST(O) = {i}
FIRST(O') = FIRST(S)|FIRST(M) = {i,a}
FIRST(E) = {b}
And now I cant find the FOLLOW set for S
, because I don't know what FOLLOW(O')
is:
FOLLOW(S) = {$, FOLLOW(O')}
Actually FOLLOW(S) = {$}
only.
So, I overlooked that S
is mentioned on a right-hand
side. Corrections below:
First we get the augmented grammar by adding the production S' ->S$
, then
FOLLOW(S') = {$}
.
Then we have
from S' -> S$
and O' -> S
FOLLOW(S) = FIRST($) + FOLLOW(O')
from M -> iEtMeM
, O' -> MeO
, and S -> M
FOLLOW(M) = FIRST(eM) + FIRST(eO) + FOLLOW(S)
from S -> O
and O' -> MeO
FOLLOW(O) = FOLLOW(S) + FOLLOW(O')
from O -> iEtO'
FOLLOW(O') = FOLLOW(O)
from M -> iEtMeM
and O -> iEtO'
FOLLOW(E) = FIRST(tMeM) + FIRST(tO')
The "problem" are the mutually recursive definitions for
FOLLOW(S)
, FOLLOW(O)
, and `FOLLOW(O') - that means that each
of these follow sets is a subset of the others, hence they are
equal.
If one represent the set inclusion constraints, imposed by the above equations, as a graph (with non-terminal symbols as nodes), each set of mutually recursive definitions forms a strongly-connected component. Replacing each SCC with a new node will result in a DAG, representing a set of non-recursive equations, which can then by "evaluated" in topological order.
Say, we replace nodes, corresponding to symbols S
, O
and O'
with node N
. The equations become:
FOLLOW(N) = FIRST($) + FOLLOW(N)
FOLLOW(M) = FIRST(eM) + FIRST(eO) + FOLLOW(N)
FOLLOW(N) = FOLLOW(N) + FOLLOW(N)
FOLLOW(N) = FOLLOW(N)
FOLLOW(E) = FIRST(tMeM) + FIRST(tO')
and by cutting off the redundant parts:
FOLLOW(N) = FIRST($) = {$}
FOLLOW(M) = FIRST(eM) + FIRST(eO) + FOLLOW(N) = {e, $}
FOLLOW(E) = FIRST(tMeM) + FIRST(tO') = {t}
and, since N
stands for either S
, O
, or O'
we get:
FOLLOW(S`) = FOLLOW(S) = FOLLOW(O) = FOLLOW(O`) = {$}
FOLLOW(M) = {e, $}
FOLLOW(E) = {t}