How can I extract the non-digit characters and the digits at the end of a string?

I have a string that has the following structure:

digit-word(s)-digit.

For example:

2029 AG.IZTAPALAPA 2

I want to extract the word(s) in the middle, and the digit at the end of the string.

I want to capture AG.IZTAPALAPA and 2 in the same capture group, so that the result looks like:

AG.IZTAPALAPA 2

I managed to capture them as individual capture groups, but not as a single one:

town_state['municipality'] = town_state['Town'].str.extract(r'(\D+)', expand=False)

town_state['number'] = town_state['Town'].str.extract(r'(\d+)$', expand=False)

Thank you for your help!

Solution

You can use a single capturing group. For the example string, the pattern below matches the leading digits, then captures a single "word" that consists of uppercase chars A-Z (with optional dots in the middle, which cannot be at the start or end), followed by a space and 1 or more digits.

\b\d+ ([A-Z]+(?:\.[A-Z]+)* \d+)\b

Explanation

\b A word boundary
\d+ Match 1+ digits, followed by a literal space
( Capture group 1
- [A-Z]+ Match 1+ occurrences of an uppercase char A-Z
- (?:\.[A-Z]+)* Repeat 0+ times matching a dot and 1+ chars A-Z
- A literal space, then \d+ to match 1+ digits
) Close group 1
\b A word boundary
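
As a quick check of the pattern above, here is a minimal sketch using Python's built-in re module. The helper name extract_name_and_number is made up for illustration and is not part of the question's code.

```python
import re

# Pattern from the answer: leading digits, then one capture group holding
# the dotted uppercase word and the trailing digits.
PATTERN = re.compile(r'\b\d+ ([A-Z]+(?:\.[A-Z]+)* \d+)\b')


def extract_name_and_number(text):
    """Return capture group 1, or None when the pattern does not match."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


# Sample string taken from the question.
result = extract_name_and_number('2029 AG.IZTAPALAPA 2')
print(result)
```

With pandas, the same pattern string can be passed to str.extract, for example town_state['Town'].str.extract(r'\b\d+ ([A-Z]+(?:\.[A-Z]+)* \d+)\b', expand=False), which returns group 1 for each row.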


Or you can make the pattern a bit broader, matching either a dot or a word character and allowing several words separated by spaces:

\b\d+ ([\w.]+(?: [\w.]+)* \d+)\b
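
To see what the broader pattern accepts, here is a small sketch, again with the re module. Only the first sample string comes from the question; the second one and the helper name extract_broad are invented for illustration.

```python
import re

# Broader pattern: one or more space-separated "words" made of word
# characters or dots, followed by a space and the trailing digits.
BROAD_PATTERN = re.compile(r'\b\d+ ([\w.]+(?: [\w.]+)* \d+)\b')


def extract_broad(text):
    """Return capture group 1, or None when the pattern does not match."""
    match = BROAD_PATTERN.search(text)
    return match.group(1) if match else None


# First sample is from the question; the second is a hypothetical
# multi-word name used only to show the difference.
samples = ['2029 AG.IZTAPALAPA 2', '2030 SAN JUAN 3']
extracted = [extract_broad(s) for s in samples]
print(extracted)
```

Note that \w also matches digits and underscores, so this version is more permissive than the uppercase-only pattern.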
