I have a dataframe with multiple columns and I want to separate the numbers from the letters with a space in one column.
In this example I want to add space in the third column.
do you know how to do so?
import pandas as pd
data = {'first_column': ['first_value', 'second_value', 'third_value'],
'second_column': ['first_value', 'second_value', 'third_value'],
'third_column':['AA6589', 'GG6589', 'BXV6589'],
'fourth_column':['first_value', 'second_value', 'third_value'],
}
df = pd.DataFrame(data)
print (df)
Use str.replace
with a short regex:
df['third_column'] = df['third_column'].str.replace(r'(\D+)(\d+)',
r'\1 \2', regex=True)
regex:
(\D+) # capture one or more non-digits
(\d+) # capture one or more digits
replace with \1 \2
(first captured group, then space, then second captured group).
Alternative with lookarounds:
df['third_column'] = df['third_column'].str.replace(r'(?<=\D)(?=\d)',
' ', regex=True)
meaning: insert a space at any position in-between a non-digit and a digit.