I'm trying to do this:
import re
sentence = "How are you?"
print(re.split(r'\b', sentence))
The result is:
[u'How are you?']
I want something like [u'How', u'are', u'you', u'?']. How can this be achieved?
Unfortunately, before Python 3.7, re.split cannot split on an empty match such as the one \b produces.
To get around this, you would need to use findall instead of split.
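Actually, \b just means word boundary. It is roughly equivalent to (?<=\w)(?=\W)|(?<=\W)(?=\w), except that \b also matches at the start or end of the string when a word character is adjacent.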
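To see where these zero-width matches fall, here is a small sketch that lists the match positions for both patterns. The variable names `boundaries` and `lookaround` are only illustrative.

```python
import re

sentence = "How are you?"

# Positions where \b matches; the matches are zero-width, so start == end.
boundaries = [m.start() for m in re.finditer(r'\b', sentence)]

# Positions matched by the lookaround form, which needs a character on both sides.
lookaround = [m.start() for m in re.finditer(r'(?<=\w)(?=\W)|(?<=\W)(?=\w)', sentence)]

print(boundaries)  # [0, 3, 4, 7, 8, 11]
print(lookaround)  # [3, 4, 7, 8, 11]
```

The only difference for this sentence is position 0, where \b matches before "How" but the lookaround form does not.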
With findall, the following code works:
import re
sentence = "How are you?"
print(re.findall(r'\w+|\W+', sentence))
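Note that \W+ also matches the spaces, so the pattern above returns them as separate tokens. If the goal is exactly the list from the question, one possible variant (a sketch, assuming whitespace should be dropped and each punctuation character should be its own token; the helper name `tokenize` is made up for illustration) is:

```python
import re

def tokenize(sentence):
    # \w+ matches a run of word characters; [^\w\s] matches a single
    # character that is neither a word character nor whitespace.
    return re.findall(r'\w+|[^\w\s]', sentence)

print(tokenize("How are you?"))  # ['How', 'are', 'you', '?']
```

Use [^\w\s]+ instead if consecutive punctuation such as "?!" should stay together as one token.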