I am looking for a pattern that will help me slice a string. The string looks something like this:
text = '1. first slice 2. second slice 3. slice number 3 4. the next one 5 that will not work but belong to no four 5. and this should be 5 and so one...'
I want to get this:
I hope you have got the idea.
So far I have found that I can use this:
import re
parts = re.findall("\d\\. \D+", text)
That works well until it encounters a lone number inside a slice. I know that \D matches any non-digit character, and I tried to use:
parts = re.findall("\d\\. .+", text)
or
parts = re.findall("(\d\\.).*", text)
and many others, but I can't find the proper one.
I will be grateful for your help.
You could use a negative lookahead:
parts = re.findall(r"\d\. (?:\D+|\d(?!\.))*", text)
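This matches a digit, a dot and a space, followed by any run of characters in which no digit is directly followed by a dot. A lone digit such as the 3 in "slice number 3" therefore stays inside its slice, while the digit that starts the next item ends the match.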
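To make the pieces easier to see, here is the same pattern sketched in re.VERBOSE mode with one comment per part. The name SLICE_RE and the short sample string are only for illustration and are not part of the question; the space is written as [ ] because verbose mode ignores bare whitespace.

```python
import re

# The same pattern, spread out with re.VERBOSE so each part can carry a comment.
# SLICE_RE is a hypothetical name chosen for this sketch.
SLICE_RE = re.compile(r"""
    \d\.[ ]        # item marker: one digit, a literal dot, one space
    (?:            # then, zero or more times, either
        \D+        #   a run of non-digit characters
      | \d(?!\.)   #   or a single digit NOT directly followed by a dot
    )*
""", re.VERBOSE)

# Short illustrative input: the lone "2" stays inside the first slice.
sample = "1. a 2 b 2. c"
print(SLICE_RE.findall(sample))  # ['1. a 2 b ', '2. c']
```

The negative lookahead (?!\.) consumes nothing; it only rejects a digit when the next character is a dot, which is what stops a slice right before the next "N." marker.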
Demo:
>>> import re
>>> text = '1. first slice 2. second slice 3. slice number 3 4. the next one 5 that will not work but belong to no four 5. and this should be 5 and so one...'
>>> re.findall(r"\d\. (?:\D+|\d(?!\.))*", text)
['1. first slice ', '2. second slice ', '3. slice number 3 ', '4. the next one 5 that will not work but belong to no four ', '5. and this should be 5 and so one...']
Online demo at https://regex101.com/r/kF9jT1/1; to simulate the re.findall() behaviour I added an extra (..) group and the g flag.
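One caveat: \d matches exactly one digit, so the pattern above assumes item numbers run from 0 to 9. If your list can go past 9 (an assumption on my part; the question only shows items 1 to 5), a variant along these lines might work. The name MULTI_DIGIT_RE and the sample string are made up for this sketch.

```python
import re

# Variant for multi-digit item numbers (assumption: items may go past 9).
# \d+(?![\d.]) accepts a run of digits only if it is followed by neither a dot
# nor another digit, so the regex cannot backtrack into "1" from "10.".
MULTI_DIGIT_RE = re.compile(r"\d+\. (?:\D+|\d+(?![\d.]))*")

sample = "9. nine has 1 digit 10. ten has 2 digits 11. eleven"
items = MULTI_DIGIT_RE.findall(sample)
print(items)
# ['9. nine has 1 digit ', '10. ten has 2 digits ', '11. eleven']
```

Note that this still treats any "number, dot" sequence as an item marker, so a decimal such as 3.5 inside a slice would end the slice early.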