There is a string like this:
strs = "Tierd-Branden This is (L.A.) 105 / New (Even L.A.A)"
The following code does not give the output I expect:
import re
strs = "Tierd-Branden This is (L.A.) 105 / New (Even L.A.A)"
print(re.findall(r"[\w']+[\w\.]", strs))
I expect this:
['Tierd', 'Branden', 'This', 'is', 'L.A.', '105', 'New', 'Even', 'L.A.A']
But I get this:
['Tierd', 'Branden', 'This', 'is', 'L.', 'A.', '105', 'New', 'Even', 'L.', 'A.']
My question is: how do I keep the dotted content of the parentheses (such as L.A.) together as a single list element?
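The [\w']+[\w\.] pattern matches 1 or more word or ' chars and then exactly one word or . char. A dot can therefore only be the last character of a match, so the pattern cannot match a chunk that has a dot in the middle or more than 1 dot in it: L.A. is split into L. and A.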
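To make the splitting concrete, here is a minimal sketch that runs the question's pattern on the two parenthesized fragments. The variable names are only illustrative, and the code uses Python 3 print syntax.

```python
import re

# The pattern from the question: a dot may only be the final character of a match.
pattern = re.compile(r"[\w']+[\w\.]")

# Every dot ends the current match, so "L.A." comes back in two pieces.
first = pattern.findall("(L.A.)")
print(first)   # ['L.', 'A.']

# The trailing single "A" is lost: the pattern needs at least two characters.
second = pattern.findall("(Even L.A.A)")
print(second)  # ['Even', 'L.', 'A.']
```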
I suggest using r"\w[\w'.]*" instead.
Details
\w - a word char
[\w'.]* - 0 or more word, ' and . chars.
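As a quick check, the suggested pattern can be run against the string from the question. This is a small sketch in Python 3 print syntax, and the name tokens is only illustrative.

```python
import re

strs = "Tierd-Branden This is (L.A.) 105 / New (Even L.A.A)"

# \w      : the token must start with a word character
# [\w'.]* : then any run of word characters, apostrophes, or dots
tokens = re.findall(r"\w[\w'.]*", strs)
print(tokens)
# ['Tierd', 'Branden', 'This', 'is', 'L.A.', '105', 'New', 'Even', 'L.A.A']
```

Note that with this pattern a sentence-final period directly after a word is kept as part of the token too (for example, "end." stays "end."), so it may need stripping if the input contains ordinary sentences.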