python string list element sublist

Splitting a list into sublists based on an element containing a string


I know there is a simple solution but I can't seem to find it.

I want to split a list into sublists, starting a new sublist at each element that contains a given string, in this case "Store".

from itertools import groupby

test_string = ['Store465', 'Steve', '145658', '125', 'Brad', '457958', '200', 'Store678', 'John', '30122', '898', '123', 'O', 'Joe', '36789', '123', 'U', ' 456']

# I've tried 

test_string[:] = [x for x in test_string if "Store" not in x]

print(test_string)

# but that just removes the Store elements

# ['Steve', '145658', '125', 'Brad', '457958', '200', 'John', '30122', '898', '123', 'O', 'Joe', '36789', '123', 'U', ' 456']

# and

test_result = [list(g) for k,g in groupby(test_string,lambda x:x if "Store" not in x) if not k]

# This creates an error. 
#
#   File "<input>", line 13
#     test_result = [list(g) for k,g in groupby(test_string,lambda x:x if "Store" not in x) if not k]
#                                                                                         ^
# SyntaxError: invalid syntax

I've searched Stack Overflow and Google for the correct syntax or approach with no luck. My desired output would be:

[['Store465', 'Steve', '145658', '125', 'Brad', '457958', '200'], ['Store678', 'John', '30122', '898', '123', 'O', 'Joe', '36789', '123', 'U', ' 456']]


Solution

  • To split the list at each element that contains "Store", you could do something like this:

    test_string = ['Store465', 'Steve', '145658', '125', 'Brad', '457958', '200', 'Store678', 'John', '30122', '898', '123', 'O', 'Joe', '36789', '123', 'U', ' 456']
    
    test_loop = []
    for item in test_string:
        if 'Store' in item:
            # Start a new sublist at each element containing 'Store'
            test_loop.append([item])
        else:
            # Append to the most recent sublist; this assumes the first element contains 'Store'
            test_loop[-1].append(item)
    print(test_loop)
    

    which gives:

    [['Store465', 'Steve', '145658', '125', 'Brad', '457958', '200'], ['Store678', 'John', '30122', '898', '123', 'O', 'Joe', '36789', '123', 'U', ' 456']]
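
As for the SyntaxError in the question: a conditional expression needs an else branch, so lambda x: x if "Store" not in x cannot be parsed. If you would rather stay with groupby, one possible sketch is below. The names split_on_marker and marker are made up for illustration, and putting any elements that appear before the first "Store" into their own sublist is an assumption, not something the question specifies.

```python
from itertools import groupby


def split_on_marker(items, marker="Store"):
    """Split items into sublists, opening a new one at each element containing marker."""
    result = []
    # Group consecutive elements by whether they contain the marker substring.
    for is_marker, group in groupby(items, key=lambda x: marker in x):
        if is_marker:
            # Every marker element opens its own sublist, even if several are adjacent.
            result.extend([item] for item in group)
        elif result:
            # Non-marker elements belong to the most recently opened sublist.
            result[-1].extend(group)
        else:
            # Assumption: elements before the first marker get a sublist of their own.
            result.append(list(group))
    return result


test_string = ['Store465', 'Steve', '145658', '125', 'Brad', '457958', '200',
               'Store678', 'John', '30122', '898', '123', 'O', 'Joe', '36789',
               '123', 'U', ' 456']
print(split_on_marker(test_string))
```

For the list in the question this should print the same two sublists as the loop above. Unlike the loop, it does not raise an IndexError when the list does not start with a "Store" element.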