My current implementation is this:
def find_longest_matching_option(option, options):
options = sorted(options, key=len)
longest_matching_option = None
for valid_option in options:
# Don't want to treat "oreo" as matching "o",
# match only if it's "o reo"
if re.match(ur"^{}\s+".format(valid_option), option.strip()):
longest_matching_option = valid_option
return longest_matching_option
Some examples of what I'm trying to do:
"foo bar baz something", ["foo", "foo bar", "foo bar baz"]
# -> "foo bar baz"
"foo bar bazsomething", (same as above)
# -> "foo bar"
"hello world", ["hello", "something_else"]
# -> "hello"
"a b", ["a", "a b"]
# -> "a b" # Doesn't work in current impl.
Mostly, I'm looking for efficiency here. The current implementation works, but I've been told it's O(m^2 * n)
, which is pretty bad.
Thanks in advance!
Let's start with foo
.
def foo(x, y):
x, y = x.strip(), y.strip()
return x == y or x.startswith(y + " ")
foo
returns true either if two strings are equal, or one (plus a space) is a substring of the other.
Next, given a case string, and a list of options, you can use filter
to find all valid substrings for the given case string, and then apply max
to find the longest one (see tests below).
Here's a few test cases for foo
. For the purpose of demonstrating, I'll use partial
to curry foo
to a higher order function.
from functools import partial
cases = ["foo bar baz something", "foo bar bazsomething", "hello world", "a b", "a b"]
options = [
["foo", "foo bar", "foo bar baz"],
["foo", "foo bar", "foo bar baz"],
["hello", "something_else"],
["a", "a b"],
["a", "a b\t"]
]
p_list = [partial(foo, c) for c in cases]
for p, o in zip(p_list, options):
print(max(filter(p, o), key=len))
foo bar baz
foo bar
hello
a b
a b