I would like to find the common string among the items of: strings_list = ['PS1 123456 Test', 'PS1 758922 Test', 'PS1 978242 Test']
The following code returns only the first part, "PS1 1", whereas I expected "PS1 Test". Is it possible to get that result using SequenceMatcher? Thank you in advance!
from difflib import SequenceMatcher

def findCommonStr(strings_list: list) -> str:
    n = len(strings_list)
    common_str = strings_list[0]
    for i in range(1, n):
        # only the first matching block is kept here
        match = SequenceMatcher(None, common_str, strings_list[i]).get_matching_blocks()[0]
        common_str = common_str[match.b: match.b + match.size]
    common_str = common_str.strip()
    return common_str
This approach does not use SequenceMatcher. If all strings follow the same pattern, you can split them into words on whitespace.
strings_list = ['PS1 123456 Test', 'PS1 758922 Test', 'PS1 978242 Test']
test = []
for string in strings_list:
    print(string.split())
    test.append(string.split())
>>> ['PS1', '123456', 'Test']
['PS1', '758922', 'Test']
['PS1', '978242', 'Test']
Now you can simply take a set intersection to find the common elements. Reference: Python - Intersection of multiple lists?
set(test[0]).intersection(*test[1:])
>>> {'PS1', 'Test'}
# join them to get string
' '.join(set(test[0]).intersection(*test[1:]))
>>> PS1 Test
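This only works if the strings follow this pattern of words separated by whitespace. Also note that sets are unordered, so the joined result is not guaranteed to keep the original word order (it could come out as "Test PS1").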
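As for the SequenceMatcher part of the question: SequenceMatcher accepts any sequences of hashable items, so under the same whitespace assumption it can compare lists of words instead of characters. Character-level matching can also pick up stray digits that the numbers happen to share, which word-level matching avoids. A minimal sketch that keeps every matching block rather than only the first one (the name common_words_by_blocks is made up for this example):

```python
from difflib import SequenceMatcher

def common_words_by_blocks(strings_list: list) -> str:
    # Hypothetical helper: compares word lists, not characters.
    if not strings_list:
        return ''
    common = strings_list[0].split()
    for other in strings_list[1:]:
        matcher = SequenceMatcher(None, common, other.split(), autojunk=False)
        # Keep the words from every matching block; the final block
        # is a size-0 sentinel and contributes nothing.
        common = [
            word
            for block in matcher.get_matching_blocks()
            for word in common[block.a: block.a + block.size]
        ]
    return ' '.join(common)

strings_list = ['PS1 123456 Test', 'PS1 758922 Test', 'PS1 978242 Test']
print(common_words_by_blocks(strings_list))
```

Because the matching blocks come back in order, the words keep the order they have in the first string. Unlike a set intersection, this also requires the common words to appear in the same relative order in every string.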
The split-and-intersect approach as a function:
def findCommonStr(strings_list: list) -> str:
    all_str = []
    for string in strings_list:
        all_str.append(string.split())
    return ' '.join(set(all_str[0]).intersection(*all_str[1:]))
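One caveat with joining a set directly is that Python sets have no defined order, so the words may not come back in their original sequence. Below is a possible variant, assuming you want the order in which the words appear in the first string. The name find_common_str_ordered is invented here and is not part of the original answer:

```python
def find_common_str_ordered(strings_list: list) -> str:
    # Hypothetical order-preserving variant of findCommonStr.
    if not strings_list:
        return ''
    word_lists = [string.split() for string in strings_list]
    common = set(word_lists[0]).intersection(*word_lists[1:])
    seen = set()
    ordered = []
    # Walk the first string's words so the output follows its order,
    # skipping repeats of a word already emitted.
    for word in word_lists[0]:
        if word in common and word not in seen:
            seen.add(word)
            ordered.append(word)
    return ' '.join(ordered)

strings_list = ['PS1 123456 Test', 'PS1 758922 Test', 'PS1 978242 Test']
print(find_common_str_ordered(strings_list))
```

The set is still used for the membership test, so only the final ordering step differs from the function above.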