Conditional string splitting in Python using regex or loops


I have a string c that consists of a repeating pattern of:

  • integer from 0 to 10,
  • character S, D, or T,
  • special character * or # (optional)

For instance, c could look like 1D2S#10S or 1D#2S*3S.

I need to do further calculations with c, and I thought it would help to first split c into substrings that each contain the integer, the character, and the special character if present. For example, 1D2S#10S would be split into 1D, 2S#, 10S, and 1D#2S*3S would be split into 1D#, 2S*, 3S.

I am aware that this kind of split can be done concisely with re.split(), but because the split is conditional I couldn't find a good way to do it. Instead, I tried using a for loop:

clist = []
n = 0
for i in range(len(c)):
  if type(c[i]) != 'int':
    if type(c[i+1]) == 'int':
      clist.append(c[n:i+1])
      n = i
    else:
      clist.append(c[n:i+2])
      n = i

This raises an indexing error, and even apart from that I can tell it isn't optimal. Is there a way to use re to split the string this way?


Solution

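  • Use re.findall() instead of re.split(): rather than describing where to cut, describe what one token looks like (digits, then one of S, D, or T, then an optional * or #) and collect every match: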

    >>> import re
    >>> re.findall(r'\d*[SDT][\*#]?', '1D2S#10S')
    ['1D', '2S#', '10S']
    >>> re.findall(r'\d*[SDT][\*#]?', '1D#2S*3S')
    ['1D#', '2S*', '3S']