string1 = '%(example_1).40s-a%(example-2)s_-%(example3)s_s1'
Desired output:
'-a', '_-', '_s1'
I need to remove everything between each '%' and the following 's'.
Attempt 1:
re.findall("[-_a-z0-9]+(?![^%]*\s)", string1)
result:
['example_1', '0s-a', 'example-', 's_-', 'example', 's_s1']
Attempt 2:
re.findall("[-_a-z0-9]+(?![^(]*\))", string1)
result:
['40s-a', 's_-', 's_s1']
Attempt 2 is close, except that it matched '40s', which lies between % and s, and it kept the trailing 's' in the other entries.
Expected output:
['-a', '_-', '_s1']
EDIT:
I want to confirm how to exclude everything between % and s from the search.
string2 = 'abc123%(example_1).40s-a%(example-2)s_-%(example3)s_s1'
expected output: ['abc123', '-a', '_-', '_s1']
string3 = 'abc123%(example_1).40s-a%(example-2)s_-%(examples3).40s'
expected output: ['abc123', '-a', '_-']
I would rather use the "negative" approach with re.split: use a non-greedy match for the characters between % and s, which keeps the regex very simple.
The only kludge is that you need to filter out empty fields (such as the one at the start of the string).
import re
string1 = '%(example_1).40s-a%(example-2)s_-%(example3)s_s1'
# Split on every "%...s" specifier (non-greedy), then drop the empty fields.
result = [x for x in re.split(r"%.*?s", string1) if x]
print(result)
result:
['-a', '_-', '_s1']
Edit: that simple expression doesn't work if the parentheses contain an "s" character. In that case you can replace it with a more complex one:
%\(.*?\).*?s|%.*?s
(The first alternative requires parentheses; the second is the previous simple expression, so a specifier without parentheses still matches.)
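As a minimal sketch, the combined expression can be checked against the three strings from the question. The names `SPEC` and `literal_parts` are made up for this illustration, and the sketch assumes every specifier ends in `s`, as in the examples; other conversion types such as `%(x)d` are not handled.

```python
import re

# Parenthesised alternative first, then the simple fallback for
# specifiers without parentheses (same expression as in the edit above).
SPEC = re.compile(r"%\(.*?\).*?s|%.*?s")

def literal_parts(template):
    """Return the non-empty pieces left after splitting on each %...s specifier."""
    return [part for part in SPEC.split(template) if part]

string1 = '%(example_1).40s-a%(example-2)s_-%(example3)s_s1'
string2 = 'abc123%(example_1).40s-a%(example-2)s_-%(example3)s_s1'
string3 = 'abc123%(example_1).40s-a%(example-2)s_-%(examples3).40s'

for s in (string1, string2, string3):
    print(literal_parts(s))
# ['-a', '_-', '_s1']
# ['abc123', '-a', '_-', '_s1']
# ['abc123', '-a', '_-']
```

string3 is the case that breaks the simple expression: `%.*?s` stops at the "s" inside "examples3" and leaves '3).40s' behind, whereas the parenthesised alternative consumes the whole specifier.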