
Add counter and distance to dictionary


Hello, I have a specific string and I am trying to calculate its edit distance to each line of a text file. I also want to count how many times each line occurs, and then sort the results.

data = "Hello"

The text file I am comparing it against, named xfile, contains:

"hola"
"how are you"
"what is up"
"everything good?"
"hola"
"everything good?"
"what is up?"
"okay"
"not cool"
"not cool"

I want to build a dictionary that maps each line of xfile to its edit distance from the string and its count. So far I can get the key and the distance, but not the count. Can someone please suggest how to do this?

My code is:

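import editdistance

data = "Hello"

utterances = {}

# readFile is the already-opened xfile
for line in readFile:
    line = line.strip()  # drop the trailing newline before comparing
    dist = editdistance.eval(data, line)
    utterances[line] = dist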

Solution

  • For every utterance you can have a dictionary containing the distance and count:

    import editdistance
    
    data = 'Hello'
    
    utterances = {}
    
    xlist = [
        'hola',
        'how are you',
        'what is up',
        'everything good?',
        'hola',
        'everything good?',
        'what is up?',
        'okay',
        'not cool',
        'not cool',
    ]
    
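    for line in xlist:
        if line not in utterances:
            # First occurrence: compute the distance once and start the count
            utterances[line] = {
                'distance': editdistance.eval(data, line),
                'count': 1
            }
        else:
            # Repeat occurrence: the distance is already stored, so only count it
            utterances[line]['count'] += 1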
    

    Then, if you need the utterances sorted in ascending order by distance or count, you can use an OrderedDict:

    from collections import OrderedDict
    
    sorted_by_distance = OrderedDict(sorted(utterances.items(), key=lambda t: t[1]['distance']))
    sorted_by_count = OrderedDict(sorted(utterances.items(), key=lambda t: t[1]['count']))
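
As a variation on the approach above, the counting can be handed to collections.Counter so that the distance is computed once per distinct line. The sketch below is only an illustration: the helper names levenshtein and summarize are made up for this example, and levenshtein is a plain-Python stand-in so the snippet runs without installing anything. With the editdistance package installed, editdistance.eval(data, line) could be used in its place.

```python
from collections import Counter


def levenshtein(a, b):
    """Dynamic-programming edit distance (hypothetical stand-in for editdistance.eval)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def summarize(data, lines):
    """Map each distinct line to its distance from `data` and its count."""
    counts = Counter(line.strip() for line in lines)
    return {
        line: {'distance': levenshtein(data, line), 'count': count}
        for line, count in counts.items()
    }


xlist = ['hola', 'how are you', 'what is up', 'everything good?', 'hola',
         'everything good?', 'what is up?', 'okay', 'not cool', 'not cool']

utterances = summarize('Hello', xlist)

# Most frequent first; ties are broken by the smaller distance.
ranked = sorted(utterances.items(),
                key=lambda item: (-item[1]['count'], item[1]['distance']))

for line, info in ranked:
    print(line, info)
```

Sorting into a list of (line, info) tuples avoids the need for an OrderedDict when the result is only going to be iterated or printed. The same summarize call should also work on an open file object, since iterating a file yields its lines.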