django, python-3.x, fuzzy-logic

Trying to convert Excel Fuzzy logic to Python function


I currently use the following fuzzy logic command in Excel to select a value from a reference table: =IF(E49>0,VLOOKUP(E49,'Ref Table'!$D$4:$E$22,2,FALSE),"--")

I am trying to write a Django/Python function that returns the value whose key is closest to the number provided. For example, a score of 14.5 should return 0.021.

I installed fuzzywuzzy but I am not sure it is the best way to implement this.

Below is an example of the function so far without the Fuzzy Logic.

from django import template

register = template.Library()

@register.simple_tag
def get_perc(score):
    if score is None:
        return '--'
    else:
        pct_dict = {
            14: 0.016,
            14.7: 0.021,
            15.3: 0.026,
            16: 0.034,
            16.7: 0.04,
            17.3: 0.05,
            18: 0.07,
            18.7: 0.09,
            19.3: 0.11,
            20: 0.13,
            20.7: 0.17,
            21.3: 0.21,
            22: 0.26,
            22.7: 0.31,
            23.3: 0.38,
            24: 0.47,
            24.7: 0.56,
            25.3: 0.68,
            26: 0.82,
            26.7: 0.98,
            27.3: 1.17,
            28: 1.39,
            29.3: 1.94,
            30: 2.28
        }
    if score in pct_dict.keys():
        return pct_dict[score]
    else:
        return '--'



Solution

  • If you are simply trying to fuzzy match the input to one of your keys, using fuzzywuzzy, you could try something like this:

    from fuzzywuzzy import process
    
    def get_perc(score):
        # I put your dictionary up here so that it's always defined.
        pct_dict = {
            14: 0.016,
            14.7: 0.021,
            15.3: 0.026,
            16: 0.034,
            16.7: 0.04,
            17.3: 0.05,
            18: 0.07,
            18.7: 0.09,
            19.3: 0.11,
            20: 0.13,
            20.7: 0.17,
            21.3: 0.21,
            22: 0.26,
            22.7: 0.31,
            23.3: 0.38,
            24: 0.47,
            24.7: 0.56,
            25.3: 0.68,
            26: 0.82,
            26.7: 0.98,
            27.3: 1.17,
            28: 1.39,
            29.3: 1.94,
            30: 2.28
        }
        MATCH_THRESHOLD = 80 # This is the minimum score needed to "match" a value
                             #  you can change it as you like.
    
        if not score:   # I changed this, so that any "falsey" value will return '--'
                        #   this includes values like '', None, 0, and False
            return '--'
    
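        # fuzzywuzzy compares strings, so convert the score and the keys to text
        # and keep a mapping from each string back to its original numeric key.
        choices = {str(key): key for key in pct_dict}
        match, match_score = process.extractOne(str(score), list(choices))
    
        if match_score >= MATCH_THRESHOLD:
            return pct_dict[choices[match]]
        else:
            return '--'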
    

    I made a few changes to your original code, with explanation in comments.

    I've never used fuzzywuzzy myself, so I based this on the "Usage" section of the fuzzywuzzy README: https://github.com/seatgeek/fuzzywuzzy
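
Since the question asks for the value whose key is numerically closest to the score, a plain numeric comparison may be a simpler fit than string matching, which measures text similarity rather than numeric distance. The following is a minimal sketch that does not use fuzzywuzzy. The names get_perc_nearest, PCT_TABLE, and max_distance are hypothetical and introduced only for illustration, and returning '--' for scores of 0 or below is an assumption carried over from the IF(E49>0, ...) check in the Excel formula.

```python
# Sketch only: nearest-key lookup by numeric distance (not fuzzywuzzy).
# PCT_TABLE copies the reference values from the question.
PCT_TABLE = {
    14: 0.016, 14.7: 0.021, 15.3: 0.026, 16: 0.034,
    16.7: 0.04, 17.3: 0.05, 18: 0.07, 18.7: 0.09,
    19.3: 0.11, 20: 0.13, 20.7: 0.17, 21.3: 0.21,
    22: 0.26, 22.7: 0.31, 23.3: 0.38, 24: 0.47,
    24.7: 0.56, 25.3: 0.68, 26: 0.82, 26.7: 0.98,
    27.3: 1.17, 28: 1.39, 29.3: 1.94, 30: 2.28,
}


def get_perc_nearest(score, table=PCT_TABLE, max_distance=None):
    """Return the value whose key is numerically closest to score, or '--'."""
    if score is None:
        return '--'
    try:
        score = float(score)
    except (TypeError, ValueError):
        # Non-numeric input (for example an empty string) has no nearest key.
        return '--'
    if score <= 0:
        # Assumption: mirror the Excel IF(E49>0, ..., "--") guard.
        return '--'

    # Pick the key with the smallest absolute difference from the score.
    # On an exact tie, min() keeps the first key in the table's order.
    nearest = min(table, key=lambda key: abs(key - score))

    # Optional cutoff (hypothetical parameter): reject scores that are
    # too far from every key instead of snapping to the table's edge.
    if max_distance is not None and abs(nearest - score) > max_distance:
        return '--'
    return table[nearest]
```

With the table above, get_perc_nearest(14.5) picks the key 14.7 and returns 0.021, which matches the example in the question. To call it from a template, it could be registered with @register.simple_tag in the same way as the original function.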