I have created this pandas dataframe:
import pandas as pd
ds = {"col1":["ROSSI Mauro", "Luca Giacomini", "Sonny Crockett"]}
df = pd.DataFrame(data=ds)
Which looks like this:
print(df)
             col1
0     ROSSI Mauro
1  Luca Giacomini
2  Sonny Crockett
Let's take a look at the column col1, which contains first names and last names (in varying order).
If a word is in all UPPER case (for example, ROSSI in record 0), then it is a last name and I need to move it after the word that is not all upper case.
So, the resulting dataframe would look like this:
             col1
0     Mauro ROSSI
1  Luca Giacomini
2  Sonny Crockett
Does anyone know how to identify the all-upper case string in col1 and move it after the non all-upper case string?
We can also use capture groups with a regex in str.replace:
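# regex=True is required on pandas 2.0+ (the default is a literal match); \s* drops the space that would otherwise lead the result
df['col1 new'] = df['col1'].str.replace(r'([A-Z]+)\b\s*(.*)', r'\2 \1', regex=True)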
Output:
             col1        col1 new
0     ROSSI Mauro     Mauro ROSSI
1  Luca Giacomini  Luca Giacomini
2  Sonny Crockett  Sonny Crockett
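The parentheses () create capture groups and \b marks a word boundary, so the first group only matches a word made up entirely of capital letters. In the replacement, \2 and \1 put the groups back in swapped order. Rows without an all-upper-case word do not match and are left unchanged. With more complex data, you'll probably have to adjust your regex.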
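If the regex gets hard to maintain, a regex-free alternative might be worth trying. The sketch below is not the approach above but a swapped-in technique: it splits each name on whitespace and uses Python's built-in str.isupper to push all-upper-case words to the end. The helper name move_upper_last is made up for illustration, and the code assumes names are separated by plain whitespace.

```python
import pandas as pd

def move_upper_last(name: str) -> str:
    """Hypothetical helper: move all-upper-case words to the end of the name."""
    words = name.split()
    # sorted() is stable and False sorts before True, so mixed-case words
    # stay in front and all-upper-case words follow in their original order.
    return " ".join(sorted(words, key=str.isupper))

df = pd.DataFrame({"col1": ["ROSSI Mauro", "Luca Giacomini", "Sonny Crockett"]})
df["col1 new"] = df["col1"].apply(move_upper_last)
print(df)
```

For the three sample rows this gives the same col1 new values as the regex version. It also leaves a name that is already in the right order (such as Mauro ROSSI) untouched. One caveat: an initial such as "J." counts as upper case for str.isupper, so it would be moved to the end as well.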