I have the following string:
'Cc1cc([N+](=O)[O-])ccc1OCC(C)(O)CN1CCN(Cc2ccccc2)CC1'
and I want to split it on [N+] and [O-] while keeping them in the result. I cannot seem to keep them when using re.split:
re.split(r'\[[^\]]*\]','Cc1cc([N+](=O)[O-])ccc1OCC(C)(O)CN1CCN(Cc2ccccc2)CC1')
output:
['Cc1cc(', '(=O)', ')ccc1OCC(C)(O)CN1CCN(Cc2ccccc2)CC1']
and I am looking for something like this:
['Cc1cc(', '[N+]','(=O)','[O-]', ')ccc1OCC(C)(O)CN1CCN(Cc2ccccc2)CC1']
I am aware of questions like "Splitting on regex without removing delimiters" and "In Python, how do I split a string and keep the separators?"
If you wrap the whole pattern in parentheses, making it a capturing group, re.split also returns the matched delimiters, which gives the desired output:
import re
s = 'Cc1cc([N+](=O)[O-])ccc1OCC(C)(O)CN1CCN(Cc2ccccc2)CC1'
re.split(r'(\[[^\]]*\])', s)
output:
['Cc1cc(', '[N+]', '(=O)', '[O-]', ')ccc1OCC(C)(O)CN1CCN(Cc2ccccc2)CC1']
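One thing to watch for: when a bracketed atom sits at the very start or end of the string, or two of them are adjacent, re.split with a capturing group also returns empty strings between them. A minimal sketch of a wrapper that drops those, assuming you do not need the empty pieces (the name split_keep_brackets is just an illustrative helper, not something from the re module):

```python
import re

def split_keep_brackets(smiles):
    # The capturing group (...) makes re.split include the matched
    # bracket expressions in the result instead of discarding them.
    parts = re.split(r'(\[[^\]]*\])', smiles)
    # Leading, trailing, or adjacent matches yield empty strings; drop them.
    return [p for p in parts if p]

s = 'Cc1cc([N+](=O)[O-])ccc1OCC(C)(O)CN1CCN(Cc2ccccc2)CC1'
print(split_keep_brackets(s))
print(split_keep_brackets('[Na+][Cl-]'))
```

Without the filter, the second call would return empty strings around and between the two bracketed atoms.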