I would like to split the following string at the upper/lower-case boundaries. How might I do this in Python and/or with a regex?
For example,
x = 'aagaaggagatataccATGAATTTGTCGGTTTACCCCAATTTAACCAAAgaaaacctgtacaa'
split_boundaries(x) = ['aagaaggagatatacc',
'ATGAATTTGTCGGTTTACCCCAATTTAACCAAA',
'gaaaacctgtacaa']
Use re.findall:
import re
x = 'aagaaggagatataccATGAATTTGTCGGTTTACCCCAATTTAACCAAAgaaaacctgtacaa'
re.findall(r'[a-z]+|[A-Z]+', x)
# ['aagaaggagatatacc', 'ATGAATTTGTCGGTTTACCCCAATTTAACCAAA', 'gaaaacctgtacaa']
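To get the split_boundaries(x) call from the question, the pattern can be wrapped in a small function. This is only a sketch under one assumption: the input contains nothing but ASCII letters, as in the example. Any other character (digits, dashes, non-ASCII letters) matches neither alternative and is silently dropped by re.findall. The second helper, split_boundaries_keep_all, is a hypothetical name for an alternative built on itertools.groupby that keeps every character by grouping on str.isupper instead.

```python
import re
from itertools import groupby


def split_boundaries(s):
    # Each match is a maximal run of ASCII lowercase or ASCII uppercase letters.
    # Characters outside a-z / A-Z are skipped, not returned.
    return re.findall(r'[a-z]+|[A-Z]+', s)


def split_boundaries_keep_all(s):
    # Hypothetical alternative: group consecutive characters by whether
    # str.isupper() is true, so nothing is dropped. Non-letters end up
    # in the "not upper" groups.
    return [''.join(group) for _, group in groupby(s, key=str.isupper)]


x = 'aagaaggagatataccATGAATTTGTCGGTTTACCCCAATTTAACCAAAgaaaacctgtacaa'
print(split_boundaries(x))
# ['aagaaggagatatacc', 'ATGAATTTGTCGGTTTACCCCAATTTAACCAAA', 'gaaaacctgtacaa']
```

For the example string both functions return the same three pieces. They differ only when the input has non-letter characters: split_boundaries('ab-CD') gives ['ab', 'CD'], while split_boundaries_keep_all('ab-CD') gives ['ab-', 'CD'].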