Trying to generate all sentences of a simple formal grammar

I am new to python and trying to generate all sentences possible in the grammar. Here is the grammar:

  #set of non terminals
  N = ('<subject>', '<predicate>', '<noun phrase>', '<noun>', '<article>', '<verb>',   '<direct object>')
  #set of teminals

  T = ('the', 'boy', 'dog', 'bit')
  #productions
  P = [ ('Sigma',           ['<subject>', '<predicate>']), \
  ('<subject>',       ['<noun phrase>']),            \
  ('<predicate>',     ['<verb>']),                   \
  ('<predicate>',     ['<verb>','<direct object>']), \
  ('<noun phrase>',   ['<article>','<noun>']),       \
  ('<direct object>', ['<noun phrase>']),            \
  ('<noun>',          ['boy']),                      \
  ('<noun>',          ['dog']),                      \
  ('<article>',       ['the']),                      \
  ('<verb>',          ['bit'])                       ]

Here is my attempt, I am using a queue class to implement it methodically,

# language defined by the previous grammar.
Q = Queue()
Q.enqueue(['Sigma'])
found = 0
while 0 < len(Q):
    print "One while loop done"
    # Get the next sentential form
    sf = Q.dequeue()
    sf1 = [y for y in sf]
    for production in P:
        for i in range(len(sf1)):
                if production[0] == sf1[i]:
                        sf[i:i+1] = [x for x in production[1]]
                        Q.enqueue(sf)
                        Q.printQ()

I am getting in infinite loop, and also I am facing some issue with shallow-deep copy, if I change one copy of sf, everything in queue changes too. Any help is appreciated, any directions, tips would be great

Here is the expected output:

       The dog bit the boy
       The boy bit the dog
       The boy bit the boy
       The dog bit the dog
       The dog bit
       The boy bit

Solution

I am facing some issue with shallow-deep copy, if I change one copy of sf, everything in queue changes too

Yes. In Python, a list is an object with its own identity. So:

Q.enqueue(['Sigma'])

creates a (one-element) list and enqueues a reference to it.

sf = Q.dequeue()

pops that reference from Q and assigns it to variable 'sf'.

sf[i:i+1] = ...

makes a change to that list (the one that 'sf' refers to).

Q.enqueue(sf)

enqueues a reference to that same list.

So there's only one list object involved, and Q just contains multiple references to it.

Instead, you presumably want each entry in Q to be a reference to a separate list (sentential form), so you have to create a new list for each call to Q.enqueue.

Depending on how you fix that, there might or might not be other problems in the code. Consider:

(1) Each sentence has multiple derivations, and you only need to 'find' one (e.g., the leftmost derivation).

(2) In general, though not in your example grammar, a production's RHS might have more than one occurrence of a non-terminal (e.g. if COND then STMT else STMT), and those occurrences need not derive the same sub-forms.

(3) In general, a grammar can generate an infinite set of sentences.

By the way, to copy a list in Python, instead of saying

copy = [x for x in original]

it's simpler to say:

copy = original[:]