Search code examples
parsingcompiler-constructioncontext-free-grammar

Compilers, finding FIRST set for grammar


I am reading the famous violet dragon book 2nd edition, and can't get the example from page 65, about creating the FIRST set:

We have the following grammar (terminals are bolded):

stmtexpr;
| if ( expr ) stmt
| for ( optexpr ; optexpr ; optexpr ) stmt
| other

optexpr → ε
| expr

And the book suggests that the following is the correct calculation of FIRST:

FIRST(stmt) → {expr, if, for, other} // agree on this

FIRST(expr ;) → {expr} // Where does this come from?

As the comment suggests, from where does the second line come from?


Solution

  • There is no error in the textbook.

    The FIRST function is defined (on page 64, emphasis added):

    Let α be a string of grammar symbols (terminals and/or nonterminals). We define FIRST(α) to be the set of terminals that appear as the first symbols of one or more strings of terminals generated from α.

    In this example, expr ; is a string of grammar symbols consisting of two terminals, so it is a possible value of α. Since it includes no nonterminals, it cannot only generate itself; thus the only string of terminals generated from that value of α is precisely expr ;, and the only terminal that will appear in FIRST(α) is the first symbol in that string, expr.

    This may all seem to be belabouring the obvious, but it leads up to the important note just under the example you cite:

    The FIRST sets must be considered if there are two productions A → α and A → β. Ignoring ε-productions for the moment, predictive parsing requires FIRST(α) and FIRST(β) to be disjoint.

    Since expr ; is one of the possible right-hand sides for stmt, we need to compute its FIRST set (even though the computation in this case is trivial) in order to test this prerequisite.