Search code examples
compiler-constructionll-grammar

What is the rationale for the third rule of the follow set algorithm?


For a compilers class, we were presented with the following rules for finding the FOLLOW(A):

α

  1. If A is the start symbol, add '$' to FOLLOW(A)
  2. If A -> α B β, add FIRST(β) - {ε} to FOLLOW(B)
  3. If A -> α B, or ( A -> α B β and β *-> ), add FOLLOW(A) to FOLLOW(B).

We were also given the informal definition of the Follow(A): The set of terminals which can immediately follow A in a sentential form.

Why is the third rule true?

I understand how to apply the third rule, but I'm confused about why it must be the case. Could anyone provide a concrete example that requires its use, or an example which would fail the informal definition in absence of the third rule?


Solution

  • Imagine you have a production rule

    A → αBβ

    and you know that β can derive the empty string. In that case, a character that can legally follow the A nonterminal can also follow the B nonterminal if you used the above production rule and then expanded β to the empty string.

    As an example, let’s look at this simple grammar:

    S → Ax

    A → CBC

    C → ε

    B → y

    Here, we can do this derivation that puts an x after B:

    S → Ax → CBCx → CBx