python, list, for-loop, alignment, sentence

Identical word alignment using a for loop


I'm trying to align words from two lists:

sentence1 = ['boy','motorcycle','people','play']
sentence2 = ['run','boy','people','boy','play','play']

and this is my code:

def identicalWordsIndex(self, sentence1, sentence2):
    identical_index = []
    for i in xrange(len(sentence1)):
        for j in xrange(len(sentence2)):
            if sentence1[i] == sentence2[j]:
                idenNew1 = [i,j]
                identical_index.append(idenNew1)
            if sentence2[j] == sentence1[i]:
                idenNew2 = [j,i]
                identical_index.append(idenNew2)
    return identical_index

What I'm trying to do is get the indices of the aligned words between sentence1 and sentence2.

The 1st result should hold the aligned word indices from sentence1 towards sentence2; the 2nd should hold the aligned word indices from sentence2 towards sentence1.

But the result from the code above looks like this:

1st : [[0, 1], [1, 0], [0, 3], [3, 0], [2, 2], [2, 2], [3, 4], [4, 3], [3, 5], [5, 3]]
2nd : [[0, 1], [1, 0], [0, 3], [3, 0], [2, 2], [2, 2], [3, 4], [4, 3], [3, 5], [5, 3]]

What I expect is this:

1st : [[0,1],[2,2],[3,4]]
2nd : [[1,0],[2,2],[3,0],[4,3],[5,3]]

Can anyone help? Thanks.


Solution

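  • You just need to add a break so the inner loop stops at the first match, drop the second if, and call the function once per direction. Try this: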

    sentence1 = ['boy','motorcycle','people','play']
    sentence2 = ['run','boy','people','boy','play','play']
    
    def identicalWordsIndex(sentence1, sentence2):
        identical_index = []
        # range works on both Python 2 and 3 (xrange is Python 2 only)
        for i in range(len(sentence1)):
            for j in range(len(sentence2)):
                if sentence1[i] == sentence2[j]:
                    identical_index.append([i, j])
                    break  # stop at the first match for this word
        return identical_index
    
    # 1st: sentence1 towards sentence2; 2nd: sentence2 towards sentence1
    print(identicalWordsIndex(sentence1, sentence2))
    print(identicalWordsIndex(sentence2, sentence1))
    

    Prints:

    [[0, 1], [2, 2], [3, 4]]

    [[1, 0], [2, 2], [3, 0], [4, 3], [5, 3]]
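
As a possible variation, the same first-match pairing could be built with a dictionary that records where each word first appears in the other list, which avoids rescanning that list for every word. This is only a sketch: the function name first_match_index and the variable names are hypothetical and do not come from the original answer.

```python
def first_match_index(source, target):
    # Map each word in target to the index of its first occurrence.
    first_seen = {}
    for j, word in enumerate(target):
        first_seen.setdefault(word, j)
    # Pair each source index with the first matching target index, if any.
    return [[i, first_seen[word]]
            for i, word in enumerate(source)
            if word in first_seen]

sentence1 = ['boy', 'motorcycle', 'people', 'play']
sentence2 = ['run', 'boy', 'people', 'boy', 'play', 'play']

first = first_match_index(sentence1, sentence2)   # sentence1 towards sentence2
second = first_match_index(sentence2, sentence1)  # sentence2 towards sentence1
print(first)
print(second)
```

For the two example lists this should give the same pairs as the nested loop with break. The trade-off is a little extra memory for the dictionary in exchange for a single pass over each list.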