I have an NLP task that is performing fine. It assigns each entry of one dataframe to an entry of another dataframe.
Both dataframes have an ID in another column. I need to count how many times the IDs are similar, where it would be fine if only the first 1-3 digits are equal. The ID consists of one letter and three digits, which I split into two columns. If the letter is wrong but the digits are equal, the match is considered useless.
from difflib import SequenceMatcher

def similar(a, b):
    return SequenceMatcher(None, a, b).ratio()

# AB is an iterable of (ID1, ID2) pairs
for ID1, ID2 in AB:
    print(similar(ID1, ID2))
I don't think the sequence matcher is the best option either. How would I code it to check the letter first and, if that is equal, check the numbers as well? I would like to turn it into a metric that counts 0.5 when only the letter is correct and 1 when both are correct.
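For illustration, the metric described above could be sketched roughly as follows. This is only a minimal sketch under some assumptions: each ID is a single string such as "A123" (one letter followed by three digits), a letter mismatch scores 0 regardless of the digits, and the names `id_score` and `pairs` are hypothetical, not part of my actual code.

```python
def id_score(id1: str, id2: str) -> float:
    """Score two IDs of the assumed form letter + three digits, e.g. 'A123'."""
    letter1, digits1 = id1[:1], id1[1:]
    letter2, digits2 = id2[:1], id2[1:]
    if letter1 != letter2:
        # Wrong letter: useless, even if the digits happen to match.
        return 0.0
    if digits1 == digits2:
        # Letter and digits both match.
        return 1.0
    # Only the letter matches.
    return 0.5


# Hypothetical example pairs standing in for the matched dataframe rows.
pairs = [("A123", "A123"), ("A123", "A456"), ("B123", "A123")]
scores = [id_score(a, b) for a, b in pairs]
total = sum(scores)
```

If the letter and digits already live in separate columns, the same comparison could be done on the columns directly instead of slicing the string.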
In the end, the issue was that I had placed the function call at the wrong indentation level in the nested for loop.