python anaconda spyder

Conditional Exclusion Based on Substring


I'm new to Python/Spyder and have an existing list (call it model). I'm trying to create another list from it that captures the second-longest occurrence of each similar string. My original approach was to count the backslashes in each string, but as you can see below, that logic wouldn't work because the entries don't all have the same number of levels. Any insights would be greatly appreciated.

Code to generate existing list:


model = ["US\\Regional\\Ford\\F150", "US\\Regional\\Ford", "Europe\\UK\\England\\Aston Martin\\Vantage","Europe\\UK\\England\\Aston Martin","Asia\\Japan\\Honda\\CRV","Asia\\Japan\\Honda","Sweden\\Volvo\\XC70","Sweden\\Volvo\\"]

Desired new list:

Make
US\Regional\Ford
Europe\UK\England\Aston Martin
Asia\Japan\Honda
Sweden\Volvo

Solution

    • Loop through each string in model. Call the current string str1.

    • Check whether model contains another string str2 such that str1 is a substring of str2.

    • If yes, add str1 to a new list, result.

    For example, "US\\Regional\\Ford" is a substring of "US\\Regional\\Ford\\F150", so "US\\Regional\\Ford" is added to result.

    model = ["US\\Regional\\Ford\\F150",
             "US\\Regional\\Ford",
             "Europe\\UK\\England\\Aston Martin\\Vantage",
             "Europe\\UK\\England\\Aston Martin",
             "Asia\\Japan\\Honda\\CRV",
             "Asia\\Japan\\Honda",
             "Sweden\\Volvo\\XC70",
             "Sweden\\Volvo\\"]
    
    result = []
    for str1 in model:
        for str2 in model:
            # Is str1 a substring of str2? Every string is a substring of
            # itself, so str1 != str2 excludes that edge case.
            if str1 in str2 and str1 != str2:
                result.append(str1)
                break  # stop at the first match so str1 is added only once
    
    print(result)
    

    Output

    ['US\\Regional\\Ford', 'Europe\\UK\\England\\Aston Martin', 'Asia\\Japan\\Honda', 'Sweden\\Volvo\\']
    

    Can you test it with different lists and let me know for which lists it does not work?
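
One kind of list where a plain substring test is likely to give unwanted results is one where a name is contained in another name without being its parent path. The sketch below wraps the same check in a helper (the name proper_substrings and the sample list tricky are made up for illustration) to show the effect:

```python
def proper_substrings(items):
    """Return every item that is a substring of some other, different item."""
    result = []
    for s1 in items:
        if any(s1 in s2 and s1 != s2 for s2 in items):
            result.append(s1)
    return result


# Hypothetical data: none of these entries has a child path in the list.
tricky = ["US\\Regional\\Ford", "US\\Regional\\Fordson\\Major", "Ford"]

print(proper_substrings(tricky))
# prints ['US\\Regional\\Ford', 'Ford']
```

Both entries are reported even though neither is the parent of another entry: "US\\Regional\\Ford" matches inside "US\\Regional\\Fordson\\Major", and "Ford" matches inside "US\\Regional\\Ford".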

    Note

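    • If model contains both "US\\Regional\\Ford\\F150" and "US\\Regional\\Ford\\F150\\something", result will also contain "US\\Regional\\Ford\\F150".
    • You do not need to replace "\\" with "\": in Python source code "\\" is the escape sequence for a single backslash, so each string already holds single backslashes. They look doubled in the output only because print(result) shows the repr of each string; printing the strings one at a time shows them as in the desired list. The entry "Sweden\\Volvo\\" does keep its trailing backslash, which you can remove with rstrip("\\").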
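
If the substring check turns out to be too loose for your data, a possible variant is to treat each string as a path and keep only entries that are followed by a backslash and a further level in some other entry. This is a minimal sketch rather than the approach above: the helper name parent_paths and its sep parameter are assumptions for illustration, and it also strips trailing backslashes so that "Sweden\\Volvo\\" becomes "Sweden\\Volvo".

```python
def parent_paths(items, sep="\\"):
    """Return each path that is the start of a longer path in items."""
    # Drop trailing separators so "Sweden\\Volvo\\" compares as "Sweden\\Volvo".
    cleaned = [p.rstrip(sep) for p in items]
    result = []
    for p in cleaned:
        prefix = p + sep  # require a separator right after the parent path
        if p not in result and any(other.startswith(prefix) for other in cleaned):
            result.append(p)
    return result


model = ["US\\Regional\\Ford\\F150",
         "US\\Regional\\Ford",
         "Europe\\UK\\England\\Aston Martin\\Vantage",
         "Europe\\UK\\England\\Aston Martin",
         "Asia\\Japan\\Honda\\CRV",
         "Asia\\Japan\\Honda",
         "Sweden\\Volvo\\XC70",
         "Sweden\\Volvo\\"]

make = parent_paths(model)
for path in make:
    print(path)  # each string prints with single backslashes
```

Requiring the separator after the prefix means "US\\Regional\\Ford" would no longer match an entry such as "US\\Regional\\Fordson\\Major".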