Search code examples
algorithmll-grammar

Generating First Set from the grammar


Algorithm for Finding first set:

Given a grammar with the rules A1 → w1, ..., An → wn, we can compute the Fi(wi) and Fi(Ai) for every rule as follows:

    initialize every Fi(Ai) with the empty set
    set Fi(wi) to Fi(wi) for every rule Ai → wi, where Fi is defined as follows:
        Fi(a w' ) = { a } for every terminal a
        Fi(A w' ) = Fi(A) for every nonterminal A with ε not in Fi(A)
        Fi(A w' ) = Fi(A) \ { ε } ∪ Fi(w' ) for every nonterminal A with ε in Fi(A)
        Fi(ε) = { ε }
    add Fi(wi) to Fi(Ai) for every rule Ai → wi
    do steps 2 and 3 until all Fi sets stay the same.

Grammar:

A -> B C c
A -> g D B
B -> EPSILON | b C D E
C -> D a B | c a
D -> EPSILON | d D
E -> g A f | c 

This website generates the first set as follows:

Non-Terminal Symbol     First Set

A                        g, ε, b, a, c, d
B                        ε, b
C                        a, c, ε, d
D                        ε, d
E                        g, c

But the algorithm says Fi(A w' ) = Fi(A) for every nonterminal A with ε not in Fi(A) so the First(A) according to this algorithm should not contain ε. First(A) = {g, b, a, c, d}.

Q: First(A) for the above grammar is = First(B) - ε U First(C) U {g} ?

This video also follows the above algorithm and do not choose ε.


Solution

  • First(B) = {ε, b}
    First(D) = {ε, d}
    First(E) = {g, c}
    First(C) = {c, d, a}
    First(A) = {b, g, c, d, a}
    

    Example:

    X -> Y a | b
    Y -> c | ε
    
    First(X) = {c, a, b}
    First(Y) = {c, ε}
    

    First(X) doesn't have ε because if you replace Y by ε, then First(Y a) is equal to First(ε a) = {a}

    First set implementation on my github.

    Edit: Updated link
    https://github.com/amirbawab/EasyCC-CPP/blob/master/src/syntax/grammar/Grammar.cpp#L229
    Computing the first and follow sets are both available on the new link above.