python, python-3.x, any

Fast way to find if list of words contains at least one word that starts with certain letters (not "find ALL words"!)


I have a set (not a list) of strings (words). It is a big one. (It was extracted from images with OpenCV and Tesseract, so there's no reliable way to predict its contents.)

At some point while working with this set I need to find out whether it contains at least one word that begins with the part I'm currently processing. So it's like this (NOT actual code):

if exists(word.startswith(word_part) in word_set) then continue else break

There is a very good answer on how to find all strings in a list that start with something:

result = [s for s in string_list if s.startswith(lookup)]

or

result = filter(lambda s: s.startswith(lookup), string_list)

But they return a list or an iterator of all the strings found. I only need to find out whether any such string exists in the set, not get them all. Performance-wise, it seems kinda stupid to build a list, take its len, check that it's more than zero, and then just drop the list.

Is there a better / faster / cleaner way?


Solution

  • Your pseudocode is very close to real code!

    if any(word.startswith(word_part) for word in word_set):
        continue
    else:
        break
    

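    any returns True as soon as it finds one true element, so it never builds the full list of matches. In the worst case (no word matches) it still checks every word in the set once.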
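As a minimal, self-contained sketch of the same idea: the helper names `has_word_with_prefix` and `counting_matches` and the sample words below are made up for illustration and are not from the question. The second helper only exists to show that any really does stop pulling items after the first match.

```python
def has_word_with_prefix(words, prefix):
    """Return True if at least one word in `words` starts with `prefix`."""
    # Generator expression (no square brackets): any() pulls results one at
    # a time and stops at the first True, so no intermediate list is built.
    return any(word.startswith(prefix) for word in words)


def counting_matches(words, prefix, counter):
    """Yield startswith results while counting how many words were examined."""
    for word in words:
        counter[0] += 1
        yield word.startswith(prefix)


# Hypothetical sample data; a real set would come from the OCR step.
word_set = {"apple", "apricot", "banana", "cherry"}
print(has_word_with_prefix(word_set, "ap"))  # True
print(has_word_with_prefix(word_set, "zz"))  # False

# A list is used here so the iteration order is predictable.
ordered_words = ["apple", "banana", "cherry"]
examined = [0]
found = any(counting_matches(ordered_words, "ap", examined))
print(found, examined[0])  # True 1
```

Note that continue and break are only valid inside a loop, so the if/else from the answer has to sit in the loop that walks through the word parts, for example `if not has_word_with_prefix(word_set, word_part): break`.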