python pipe nltk context-free-grammar chomsky-normal-form

How to deal with some ambiguous context free grammar productions in Python


I am trying to use a CNF grammar by feeding nltk.cfg a set of grammar productions like:

 NN -> 'rubble' | 'slope' | 'Jake'
 VP -> V NP | VP PP 

But it fails (with the error: Expected an arrow) on productions that have pipes on the left-hand side. Example:

VP | <VBP-SBAR> -> VBP SBAR

Does nltk have any grammar method that can handle pipes on the left-hand side?

If not, how can I convert all those productions into usable ones like the first group? Example:

VP -> VBP SBAR
<VBP-SBAR> -> VBP SBAR

Solution

  • A production rule with multiple options on the left-hand side is not a valid context-free production: every rule in a context-free grammar must have exactly one nonterminal on its LHS.

    The combined form is not really needed in the first place, since you can split the rule

    VP | <VBP-SBAR> -> VBP SBAR
    

    into two rules

    VP -> VBP SBAR
    <VBP-SBAR> -> VBP SBAR
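
The split described above can be automated before the grammar text reaches nltk. The following is a minimal sketch in plain Python, not an nltk feature: the helper name `split_lhs_alternatives` is hypothetical, and it assumes one rule per line with `->` as the arrow and no `|` or `->` inside a quoted terminal on the left-hand side.

```python
def split_lhs_alternatives(grammar_text):
    """Expand rules like 'A | B -> X Y' into 'A -> X Y' and 'B -> X Y'.

    Hypothetical helper: assumes one rule per line and '->' as the arrow.
    Pipes on the right-hand side are left untouched.
    """
    expanded = []
    for line in grammar_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first arrow only; everything after it is the RHS.
        lhs, arrow, rhs = line.partition("->")
        if not arrow:
            raise ValueError(f"Expected an arrow in rule: {line!r}")
        # Emit one rule per left-hand-side alternative.
        for head in lhs.split("|"):
            head = head.strip()
            if head:
                expanded.append(f"{head} -> {rhs.strip()}")
    return "\n".join(expanded)


rules = """
NN -> 'rubble' | 'slope' | 'Jake'
VP -> V NP | VP PP
VP | <VBP-SBAR> -> VBP SBAR
"""

print(split_lhs_alternatives(rules))
```

The printed text has a single nonterminal on the left of every arrow and can then be handed to the grammar loader. Whether the loader accepts nonterminal names that begin with `<`, such as `<VBP-SBAR>`, is not checked here; renaming them may be needed.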