Tags: python, python-3.x, fuzzy-search

Check for word in string with unpredictable delimiters


I am looking for something slightly more reliable for unpredictable strings than just checking if "word" in "check for word".

To give an example, let's say I have the following sentence:

"Learning Python!"

If the sentence contains "Python", I'd want the check to evaluate to true, but what if it were:

"Learning #python!"

Splitting on spaces would give me ["Learning", "#python!"], and neither element equals "python", even after lowercasing.

(Note: I understand that I could remove the # for this particular case, but 1. I am tagging programming languages and don't want to strip the # out of C#, and 2. this is just one example; there are many different ways human-typed titles could include these hints, and I'd still like to catch them.)

Basically, I'd like to check whether, beyond reasonable doubt, the sequence of characters I'm looking for is there, however oddly it is written. What are some ways to do this? I have looked at fuzzy search a bit, but I haven't seen any use cases that look for single words.

The end goal is this: I have tags for programming languages, and I'd like to take people's stream titles and apply a language's tag if it's mentioned in the title.


Solution

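  • This code prints True if the string contains ‘python’ anywhere, ignoring case. Because re.search scans the whole string, it also matches "Learning #python!".

    import re
    
    # Named `title` rather than `input` to avoid shadowing the built-in input().
    title = "Learning Python!"
    print(re.search("python", title, re.IGNORECASE) is not None)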
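A plain substring search like the one above would also report "Java" for a title that only mentions "JavaScript", and "C" for one that mentions "C#". One possible refinement, offered as a rough sketch rather than a definitive answer, is to require that the tag is not glued to other letters (or to + and #) on either side. The tag list and the names LANGUAGE_TAGS, build_pattern, and find_tags below are made up for illustration and do not come from the question.

```python
import re

# Hypothetical tag list for illustration; substitute your real tags.
LANGUAGE_TAGS = ["Python", "C#", "C++", "C", "Go", "JavaScript", "Java"]

def build_pattern(tag):
    # Before the tag: no letter or digit (so "#python" still matches).
    # After the tag: no letter, '+', or '#' (so "C" does not match inside
    # "C#" or "C++", and "Java" does not match inside "JavaScript").
    # Trailing digits are allowed, so "python3" still counts as Python.
    return re.compile(
        r"(?<![A-Za-z0-9])" + re.escape(tag) + r"(?![A-Za-z+#])",
        re.IGNORECASE,
    )

TAG_PATTERNS = {tag: build_pattern(tag) for tag in LANGUAGE_TAGS}

def find_tags(title):
    """Return the tags from LANGUAGE_TAGS that appear in the title."""
    return [tag for tag, pattern in TAG_PATTERNS.items() if pattern.search(title)]

if __name__ == "__main__":
    for example in ["Learning #python!", "Coding in C# today", "JavaScript basics"]:
        print(example, "->", find_tags(example))
```

Short tags that are also ordinary words remain ambiguous under this approach: a title such as "Let's go live" would still be tagged Go, so those tags may need extra rules or manual review.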