I'm curious about the result of this Python code, which splits the empty string '':
import re
x = re.split(r'\W*', '')
y = re.split(r'(\W*)', '')
Since the string is empty, I expected the result of x = re.split(r'\W*', '') to be an empty list and that of y = re.split(r'(\W*)', '') to be [''].
The actual result of x = re.split(r'\W*', '') is ['', ''] and that of y = re.split(r'(\W*)', '') is ['', '', ''].
I don't know what leads to these results.
Note that the regular expression \W* can match the empty string. Thus, while it's not useful, it's true that the empty string can be split around such an empty match into three empty pieces:
'' = '' + '' + ''
1. '' that precedes the match of the regular expression
2. '' that matches the regular expression
3. '' that follows the match of the regular expression
In the first case, you get strings 1 and 3.
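In the second case, you also get string 2, because the parentheses form a capturing group and re.split includes the text of captured groups in its result.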
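The decomposition can be checked by hand. This is a minimal sketch, assuming Python 3.7 or later (the version in which re.split started splitting on empty matches). The names s, m, before, matched and after are illustrative only and not part of the original question.

```python
import re

s = ''

# \W* finds a zero-width match at position 0 of the empty string.
m = re.search(r'\W*', s)

before = s[:m.start()]   # string 1: the text preceding the match
matched = m.group()      # string 2: the text of the match itself
after = s[m.end():]      # string 3: the text following the match

# Without a capturing group, re.split returns strings 1 and 3.
print(re.split(r'\W*', s) == [before, after])

# With a capturing group, string 2 is included as well.
print(re.split(r'(\W*)', s) == [before, matched, after])
```

Both comparisons should print True on Python 3.7+. Older versions treated empty matches differently, so the results there may not agree.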
(In general, it's probably never a good idea to use a regular expression that can match the empty string as the pattern passed to re.split.)
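As a hedged illustration of that advice, here is the same call with \W+ swapped in for \W*. \W+ requires at least one non-word character and so cannot match the empty string. The sample text 'spam, eggs; ham' and the variable names are made up for this sketch.

```python
import re

# \W+ cannot match '', so no split happens and the whole
# (empty) input comes back as the only element.
no_match = re.split(r'\W+', '')
print(no_match)   # ['']

# On ordinary text, each run of non-word characters is one separator.
words = re.split(r'\W+', 'spam, eggs; ham')
print(words)      # ['spam', 'eggs', 'ham']
```

Note that even here the result for the empty string is [''] rather than an empty list: when nothing matches, re.split returns the input unchanged as a single element.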