Comparing two words from two lists in Python


I would like to compare words that are in two different lists, so for example, I have:
['freeze','dog','difficult','answer'] and another list ['freaze','dot','dificult','anser']. I want to compare the words in these lists and give marks for incorrect letters. So, +1 for being correct, and -1 for one letter wrong. To give some context, in a spelling test, the first list would be the answer key, and the second list would be the student's answers. How would I go about doing this?


Solution

  • Assuming the two lists are the same length and you have some function grade(a,b) where a,b are strings:

    key = ['freeze','dog','difficult','answer']
    ans = ['freaze','dot','dificult','anser']
    
    pairs = zip(key, ans)
    score = sum(grade(k,v) for (k,v) in pairs)
    

    A possible grading function would be:

    def grade(a,b):
        return 1 if a == b else -1
    

    A grading function that deducts a point for each wrong character and gives 1 point for a correct spelling (that sounds harsh...) might be:

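    def grade(a, b):
        # Matching positions minus the longer length: 0 for an exact match,
        # otherwise negative, one point per mismatched or unmatched position.
        matches = sum(x == y for (x, y) in zip(a, b))
        score = matches - max(len(a), len(b))
        return score if score else 1  # an exact match earns 1 point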
    

    If you want the Levenshtein distance, you would probably want your grade function to be a wrapper around the following, which was found on Wikibooks and appears to be reasonably efficient:

    def levenshtein(seq1, seq2):
        # thisrow[y] is the distance between the seq1 prefix seen so far and
        # seq2[:y + 1]; the extra last slot (index -1) is the empty-prefix column.
        # list(range(...)) and range() keep this runnable on Python 2.7 and 3.
        thisrow = list(range(1, len(seq2) + 1)) + [0]
        for x in range(len(seq1)):
            oneago, thisrow = thisrow, [0] * len(seq2) + [x + 1]
            for y in range(len(seq2)):
                delcost = oneago[y] + 1
                addcost = thisrow[y - 1] + 1
                subcost = oneago[y - 1] + (seq1[x] != seq2[y])
                thisrow[y] = min(delcost, addcost, subcost)
        return thisrow[len(seq2) - 1]
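
As a rough sketch of what such a wrapper might look like: the scoring rule here (+1 for a correct word, otherwise -1 per edit) is only one reading of the question, and the compact `levenshtein` below is a stand-in included so the snippet runs on its own. Any edit-distance function with the same signature could be substituted.

```python
def levenshtein(a, b):
    """Compact stand-in edit distance (assumption: any equivalent works)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


def grade(key_word, answer):
    """Assumed rule: +1 for a correct spelling, otherwise -1 per edit."""
    distance = levenshtein(key_word, answer)
    return 1 if distance == 0 else -distance


key = ['freeze', 'dog', 'difficult', 'answer']
ans = ['freaze', 'dot', 'dificult', 'anser']

marks = [grade(k, a) for k, a in zip(key, ans)]
total = sum(marks)
```

With the lists from the question, each word is one edit away from the key, so `marks` is `[-1, -1, -1, -1]` and `total` is `-4`. Note the difference from the position-by-position function earlier in the answer, which scores 'dificult' at -6 because the dropped letter shifts every later character out of alignment.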
    

    You could also take a look at the standard library's difflib module for more elaborate comparisons.
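
For the difflib route, one possible sketch uses `difflib.SequenceMatcher` and its `get_opcodes()` method to count the letters that differ. The helper names `wrong_letters` and `grade`, and the rule of -1 per differing letter, are assumptions for illustration. `SequenceMatcher` aims for matches that look right to a person and does not guarantee a minimal edit count, so treat the result as an approximation of the Levenshtein distance.

```python
import difflib


def wrong_letters(key_word, answer):
    """Count differing letters using SequenceMatcher's edit opcodes."""
    matcher = difflib.SequenceMatcher(None, key_word, answer)
    errors = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != 'equal':
            # A replaced, deleted, or inserted run costs its longer side.
            errors += max(i2 - i1, j2 - j1)
    return errors


def grade(key_word, answer):
    """Assumed rule: +1 for a correct spelling, otherwise -1 per wrong letter."""
    errors = wrong_letters(key_word, answer)
    return 1 if errors == 0 else -errors


key = ['freeze', 'dog', 'difficult', 'answer']
ans = ['freaze', 'dot', 'dificult', 'anser']

marks = [grade(k, a) for k, a in zip(key, ans)]
```

If a similarity measure between 0 and 1 is enough, `SequenceMatcher(None, a, b).ratio()` gives one directly without walking the opcodes.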