Search code examples
context-free-grammarambiguous-grammar

Is this context free grammar ambiguous?


I would like to know if the following context-free grammar is ambiguous or not.

S -> 0S | A | 0
A -> 1A | 1

I am fairly convinced that it is unambiguous but I was told that it was ambiguous instead. Could someone please help me?

Thank you.


Solution

  • The grammar is unambiguous.

    First, we can show that the language of the grammar is 0*(0 + 1*1); that is, the language of any number of 0s, followed either by a single 0 or by any non-empty string of 1s. Note that any such string can be obtained as follows:

    1. if the string is 0^k with k > 0: S -> 0S (k-1) times, then S -> 0 once.
    2. if the string is 0^i 1^k with i >= 0 and k > 0 then: S -> 0S i times, followed by S -> A once, then A -> 1A (k-1) times and A -> 1 once.

    Also note that the grammar cannot generate anything but such strings:

    1. all 0s come before any 1s
    2. any string without 1s must have at least one 0.

    Now the question is whether there exist different parse trees for any generated string. We can show that there are not very simply using cases:

    1. if the string is 0^k with k > 0: only two productions introduce 0s: S -> S0 and S -> 0. To get k instances of 0 we are forced to first use production S -> S0 (k-1) times and then use S -> 0 since otherwise we would terminate the intermediate form before getting to a string of length k.

    2. if the string is 0^i 1^k with i >= 0 and k > 0: we are forced to use production S -> 0 i times to get 0^i since no other sequence of productions will give us an unterminated intermediate form beginning with i 0s. Then, we are forced to use S -> A since any other choice will add extra 0s. Next, we are forced to use A -> 1A a number of times equal to (k - 1) since otherwise we'd terminate the string before getting to the required k instances of 1, since the only remaining production is A -> 1 which eliminates the last nonterminal. Finally, we are forced to use A -> 1 to terminate the string.

    So, in both cases, our choice of productions was dictated by the numbers of 0s and 1s; we never had an arbitrary choice of which production to use. In fact, since all intermediate forms only ever contained one nonterminal, we never even had a choice about what order to use productions: not only is there one parse tree for each string, but only one order in which the grammar can derive strings. There are unambiguous grammars where even this strong a condition does not hold; consider

    S -> AB
    A -> a
    B -> b
    

    This is unambiguous even though there are two derivations for the string ab:

    S -> AB -> aB -> ab
    S -> AB -> Ab -> ab
    

    In both cases, the tree is the same:

      A - a
     /
    S
     \
      B - b