Search code examples
grammarcontext-free-grammarlexical-analysisformal-languages

Tips for creating "Context Free Grammar"


I am new to CFG's,
Can someone give me tips in creating CFG that generates some language

For example

L = {am bn | m >= n}

What I got is:

So -> a | aSo | aS1 | e
S1 -> b | bS1 | e

but I think this area is wrong, because there is a chance that the number of b's can be greater than a's.


Solution

  • How to write CFG with example ambn

    L = {am bn | m >= n}.

    Language description: am bn consist of a followed by b where number of a are equal or more then number of b.

    some example strings: {^, a, aa, aab, aabb, aaaab, ab......}

    So there is always one a for one b but extra a are possible. infect string can be consist of a only. Also notice ^ null is a member of language because in ^ NumberOf(a) = NumberOf(b) = 0

    How to write a grammar that accepts the language formed by strings am bn?

    In the grammar, there should be rules such that if you add a b symbol you also add a a symbol.

    and this can be done with something like:

       S --> aSb 
    

    But this is incomplete because we need a rule to generate extra as:

       A --> aA | a
    

    Combine two production rules into a single grammar CFG.

       S --> aSb | A
       A --> aA  | a
    

    So you can generate any string that consist of a also a and b in (am bn) pattern.

    But in above grammar there is no way to generate ^ string.

    So, change this grammar like this:

       S --> B   | ^
       B --> aBb | A
       A --> aA  | a
    

    this grammar can generate {am bn | m >= n} language.

    Note: to generate ^ null string, I added an extra first step in grammar by adding S--> B | ^, So you can either add ^ or your string of symbol a and b. (now B plays role of S from previous grammar to generate equal numbers of a and b)

    Edit: Thanks to @Andy Hayden
    You can also write equivalent grammar for same language {am bn | m >= n}:

       S --> aSb | A
       A --> aA | ^
    

    notice: here A --> aA | ^ can generate zero or any number of a. And that should be preferable to my grammar because it generates a smaller parse tree for the same string.
    (smaller in height preferable because of efficient parsing)

    The following tips may be helpful to write Grammar for a formal language:

    • You are to be clear about language that what it describes (meaning/pattern).
    • You can remember solutions for some basic problems(the idea being that you can write new grammars).
    • You can write rules for fundamental languages like I have written for RE in this example to write Right-Linear-Grammmar. The rules will help you to write Grammar for New Languages.
    • One different approach is to first draw automata, then convert automata to Grammar. We have predefined techniques to write grammar from automata from any class of formal language.
    • Like a Good Programmer who learns by reading the code of others, similarly one can learn to write grammars for formal languages.

    Also the grammar you have written is wrong.