Tags: python, string, dictionary, comparison, string-comparison

Find whole part of list item, not subparts, in a string?


I have a dictionary of keys & values (massively truncated for ease of reading):

responsePolarities = {'yes':0.95, 'hell yes':0.99, 'no':-0.95, 'hell no':-0.99, 'okay':0.70}

I am doing a check to see if any key is in a string passed to my function:

for key, value in responsePolarities.items():
    if key in string:
        return value

The problem is that if the passed string contains a word such as "know", the function sees the 'no' in 'know' and returns -0.95.

I can't add spaces around the 'no' key because it could be the only response provided.

How can I make the function see 'no' as 'no' but not 'know'? Am I correct in thinking this is probably going to need to be a RegExp job, or is there something simpler I'm missing?

I thought about splitting my passed string into individual words, but then I couldn't check for multi-word phrases that modify the response polarity (like no vs. hell no)...


Solution

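  • If I understand this correctly, you want to match text that contains one of your keys, but only when the key appears as a whole word or phrase. You can do this with the regex word-boundary anchor \b. It matches at the position between a word character (a letter, digit, or underscore) and a non-word character, or at the start or end of the string. So 'no' matches on its own or next to punctuation, as in ':no', but not inside 'know', where it is surrounded by other word characters. Here you loop through some strings and, for each one, find the matching keys in the dictionary: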

    import re

    responsePolarities = {'yes':0.95, 'hell yes':0.99, 'no':-0.95, 'hell no':-0.99, 'okay':0.70}
    
    strings = [
        'I know nothing',
        'I now think the answer is no',
        'hell, mayb yes',
        'or hell yes',
        'i thought:yes or maybe--hell yes--'
    ]
    
    for s in strings:
        for k,v in responsePolarities.items():
            # \b on both sides means the key only matches as a whole word or phrase;
            # re.escape keeps any regex metacharacters in a key literal
            if re.search(rf"\b{re.escape(k)}\b", s):
                print(f"'{s}' matches: {k} : {v}")
    

    'I know nothing' doesn't match any key, so nothing is printed for it. The output is:

    'I now think the answer is no' matches: no : -0.95
    'hell, mayb yes' matches: yes : 0.95
    'or hell yes' matches: yes : 0.95
    'or hell yes' matches: hell yes : 0.99
    'i thought:yes or maybe--hell yes--' matches: yes : 0.95
    'i thought:yes or maybe--hell yes--' matches: hell yes : 0.99

    If you are doing a lot of searches, you might consider precompiling the regexes before the loop.
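
As a rough sketch of that precompiling idea, you could build one alternation pattern from all the keys up front and wrap the lookup in a function, which is closer to what the question's function does. The names `polarity` and `_pattern` are made up for this example, and it assumes the dictionary keys are lowercase. Keys are sorted longest first so that 'hell no' is tried before 'no'. Note that it returns the value for the first (leftmost) match only, not every match:

```python
import re

responsePolarities = {'yes': 0.95, 'hell yes': 0.99, 'no': -0.95, 'hell no': -0.99, 'okay': 0.70}

# Try longer keys first so a phrase like 'hell no' wins over a bare 'no'
# when both could match starting at the same position.
_keys = sorted(responsePolarities, key=len, reverse=True)

# Compile once, outside any loop. re.escape keeps each key literal.
_pattern = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in _keys) + r")\b",
    re.IGNORECASE,
)

def polarity(text):
    """Return the polarity of the first whole-word key found in text, or None."""
    match = _pattern.search(text)
    if match is None:
        return None
    # Keys are assumed to be lowercase, so normalise the matched text.
    return responsePolarities[match.group(0).lower()]

for s in ['I know nothing', 'no', 'or hell yes', 'Hell No!']:
    print(repr(s), '->', polarity(s))
```

If you need every matching key rather than just the first, `_pattern.finditer(text)` would give you all non-overlapping matches instead.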