python, regex, list, enumerate

How to concatenate the string elements of a list between two positions identified by a regex?


I have a long list of strings (thousands of elements) and need to capture and concatenate the strings that lie between two elements matched by a regex.

See the code below. I am stuck on how to capture the elements in between and concatenate them into one string.


import re

my_list = ['this is a test element 1', 'I need to capture after this element', 'capture1', 'capture2', 'capture3', '.........', 'I need to capture before this element']
my_reg = re.compile(r'I need to capture.+')

captured_text = []
for i, element in enumerate(my_list):
    m = my_reg.match(element)
    if m:
        # tries to grab the element right after each match
        captured_text.append(my_list[i+1])

But i+1 is out of range: the last element also matches the regex, so my_list[i+1] raises an IndexError.

I hope to end up with a string capture1capture2capture3.....


Solution

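  • Collect the index of every element that matches the regex, then slice the list between the first and the last match:

    match_indices = [i for i, s in enumerate(my_list) if my_reg.match(s)]
    captured_text = my_list[min(match_indices)+1 : max(match_indices)]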
    

    The result:

    >>> captured_text
    ['capture1', 'capture2', 'capture3', '.........']
    >>> "".join(captured_text)
    'capture1capture2capture3.........'
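
If the list might contain fewer than two marker elements, min() and max() either raise on an empty list or produce an empty slice without saying why. The following is a minimal sketch of the same idea wrapped in a function with that case handled explicitly. The name join_between_markers and the sep parameter are made up for illustration and are not part of the original answer.

```python
import re


def join_between_markers(items, pattern, sep=""):
    """Join the items strictly between the first and last item matching pattern.

    Hypothetical helper: returns an empty string when fewer than two
    items match, instead of raising.
    """
    marker = re.compile(pattern)
    # Indices come out in ascending order, so hits[0] is the first match
    # and hits[-1] is the last one.
    hits = [i for i, s in enumerate(items) if marker.match(s)]
    if len(hits) < 2:
        return ""
    return sep.join(items[hits[0] + 1 : hits[-1]])


my_list = [
    'this is a test element 1',
    'I need to capture after this element',
    'capture1', 'capture2', 'capture3', '.........',
    'I need to capture before this element',
]

result = join_between_markers(my_list, r'I need to capture.+')
print(result)  # capture1capture2capture3.........
```

Note that re.match only anchors at the start of each string. If the marker text can appear in the middle of an element, re.search would be the call to use instead.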