I'm trying to use regex in Python to match acronyms separated by periods. I have the following code:
import re
test_string = "U.S.A."
pattern = r'([A-Z]\.)+'
print(re.findall(pattern, test_string))
The result of this is:
['A.']
I'm confused as to why this is the result. I know + is greedy, but why are the first occurrences of [A-Z]\. ignored?
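The (...) in regex creates a capturing group. When a pattern contains a capturing group, re.findall returns the text captured by the group rather than the whole match, and a group that repeats keeps only its last repetition, which here is 'A.'. The earlier occurrences are matched, just not returned. I suggest changing to a non-capturing group: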
pattern = r'(?:[A-Z]\.)+'
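To make the difference concrete, here is a small sketch comparing the two patterns on the same input. The variable names (capturing, non_capturing, whole_matches) are only illustrative; the re.finditer line is an alternative, not something you need, in case you want to keep the capturing group and still get the full match.

```python
import re

test_string = "U.S.A."

# Capturing group: findall returns the group's content, and a repeated
# group only remembers its last repetition.
capturing = re.findall(r'([A-Z]\.)+', test_string)

# Non-capturing group: no group to report, so findall returns the whole match.
non_capturing = re.findall(r'(?:[A-Z]\.)+', test_string)

# Alternative: keep the capturing group but ask each match object for
# group(0), which is always the entire matched text.
whole_matches = [m.group(0) for m in re.finditer(r'([A-Z]\.)+', test_string)]

print(capturing)      # ['A.']
print(non_capturing)  # ['U.S.A.']
print(whole_matches)  # ['U.S.A.']
```

So the regex engine does match all of "U.S.A." in both cases; only what findall reports changes.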