I'm new to Python/Spyder and have an existing list (call it model). I'm trying to create another list from it that captures the second-longest occurrence of each group of similar strings. My original approach was to count the backslashes in each string, but as you can see below, that logic wouldn't work. Any insights would be greatly appreciated.
Code to generate existing list:
model = ["US\\Regional\\Ford\\F150", "US\\Regional\\Ford", "Europe\\UK\\England\\Aston Martin\\Vantage","Europe\\UK\\England\\Aston Martin","Asia\\Japan\\Honda\\CRV","Asia\\Japan\\Honda","Sweden\\Volvo\\XC70","Sweden\\Volvo\\"]
Desired new list:
Make |
---|
US\Regional\Ford |
Europe\UK\England\Aston Martin |
Asia\Japan\Honda |
Sweden\Volvo |
Loop through each string in model. Call the current string str1.
Check whether there is another string str2 in model such that str1 is a substring of str2.
If yes, add str1 to a new list result.
For example, "US\\Regional\\Ford" is a substring of "US\\Regional\\Ford\\F150", so "US\\Regional\\Ford" is added to result.
model = ["US\\Regional\\Ford\\F150",
         "US\\Regional\\Ford",
         "Europe\\UK\\England\\Aston Martin\\Vantage",
         "Europe\\UK\\England\\Aston Martin",
         "Asia\\Japan\\Honda\\CRV",
         "Asia\\Japan\\Honda",
         "Sweden\\Volvo\\XC70",
         "Sweden\\Volvo\\"]
result = []
for str1 in model:
    for str2 in model:
        # Is str1 a substring of str2?
        # A string is also a substring of itself, so we have to exclude this edge case.
        if str1 in str2 and str1 != str2:
            result.append(str1)
print(result)
Output
['US\\Regional\\Ford', 'Europe\\UK\\England\\Aston Martin', 'Asia\\Japan\\Honda', 'Sweden\\Volvo\\']
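As a possible variation, the substring test can be swapped for a prefix test with str.startswith, and any() stops at the first match so each entry is added at most once. This is only a sketch: the helper name prefixes_of_longer_entries is made up for illustration, and it assumes the shorter string is always expected at the start of the longer one.

```python
def prefixes_of_longer_entries(model):
    """Return the entries that are a proper prefix of at least one other entry.

    Hypothetical helper, not part of the original answer.
    """
    result = []
    for str1 in model:
        # any() short-circuits on the first match, so str1 is appended
        # only once even if several longer strings start with it.
        if any(str2 != str1 and str2.startswith(str1) for str2 in model):
            result.append(str1)
    return result


model = ["US\\Regional\\Ford\\F150",
         "US\\Regional\\Ford",
         "Europe\\UK\\England\\Aston Martin\\Vantage",
         "Europe\\UK\\England\\Aston Martin",
         "Asia\\Japan\\Honda\\CRV",
         "Asia\\Japan\\Honda",
         "Sweden\\Volvo\\XC70",
         "Sweden\\Volvo\\"]

print(prefixes_of_longer_entries(model))
```

For the list in the question this gives the same four entries as the nested loop. The prefix test is stricter than "in": it ignores matches that occur in the middle of a longer string.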
Can you test it with different lists and let me know for which lists it does not work?
If model contains "US\\Regional\\Ford\\F150" and "US\\Regional\\Ford\\F150\\something", result will contain "US\\Regional\\Ford\\F150".
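The case from this comment can be reproduced by running the same nested loop on a three-level list. The sketch below uses a hypothetical model made up for illustration; it is not data from the question.

```python
# Hypothetical three-level input: each string is a substring of the one before it.
model = ["US\\Regional\\Ford\\F150\\something",
         "US\\Regional\\Ford\\F150",
         "US\\Regional\\Ford"]

result = []
for str1 in model:
    for str2 in model:
        # Same test as in the answer: proper substring of another entry.
        if str1 in str2 and str1 != str2:
            result.append(str1)

print(result)
# ['US\\Regional\\Ford\\F150', 'US\\Regional\\Ford', 'US\\Regional\\Ford']
```

Here "US\\Regional\\Ford\\F150" ends up in result, and "US\\Regional\\Ford" is appended twice because it is a substring of both longer strings. So with more than two levels per group, the loop returns every shorter entry rather than only the second-longest one.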