Putting a space between the 3rd and 4th characters of a postal code in Python

I have a df (data frame) in Python with a postal_code variable that holds Canadian postal codes. All the postal codes are stored without a space, such as k1b6j2. But to use the 'pgeocode' package, I need the postal code in the standard form, like k1b 6j2 (there must be a space between the 3rd and 4th characters).

So, is there a way to replace all the postal codes with the modified ones in the df?

Many Thanks, Ben

Solution

You can simply redefine your column in the format you are expecting:

import pandas as pd

# Example dataset
df = pd.DataFrame({'ID': [1, 2, 3, 4], 'postal_code': ['k1b6j2', 'k1b6j3', 'k1b6j4', 'k1b6j5']})

   ID postal_code
0   1      k1b6j2
1   2      k1b6j3
2   3      k1b6j4
3   4      k1b6j5

Redefine the postal_code column in the format you desire:

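# Cast to str so slicing works (note: missing values become strings such as 'nan')
df['postal_code'] = df['postal_code'].astype(str)
# Insert a space between the 3rd and 4th characters
df['postal_code'] = [(x[:3] + " " + x[3:]) for x in df['postal_code']]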

Output:

   ID postal_code
0   1     k1b 6j2
1   2     k1b 6j3
2   3     k1b 6j4
3   4     k1b 6j5
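
If the real column is messier than the example (some codes already spaced, stray whitespace, missing values), the plain slice would produce a double space or turn a missing value into text. The following is only a sketch of one way to guard against that; the helper name add_space and the rule "leave anything that is not six characters untouched" are assumptions for illustration, not something the question specifies.

```python
import pandas as pd

# Hypothetical messy input: one code already spaced, one padded, one missing
df = pd.DataFrame({"postal_code": ["k1b6j2", "k1b 6j3", " K1B6J4 ", None]})

def add_space(code):
    """Return the code with a single space after the 3rd character."""
    # Leave missing values untouched instead of turning them into text
    if pd.isna(code):
        return code
    # Drop any whitespace that is already present
    compact = "".join(str(code).split())
    # Assumption: anything that is not six characters is left as-is
    if len(compact) != 6:
        return code
    return compact[:3] + " " + compact[3:]

df["postal_code"] = df["postal_code"].map(add_space)
print(df)
```

For clean input like the example dataset, this gives the same result as the list comprehension above; it is slower, so it is only worth using when the data needs the extra checks.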

Finally, a timing comparison against the .str accessor alternative on a 100k-row dataframe:

df['postal_code'] = [(x[:3]+" "+x[3:]) for x in df['postal_code']] 
First method time:  0.08260965347290039
df['postal_code'] = df["postal_code"].str[:3] + " " + df["postal_code"].str[3:]
Second method time:  0.112518310546875
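
To reproduce a comparison like this on your own machine, something along these lines should work. It is a rough sketch: the function names list_comp and str_accessor and the repeated sample code are made up for the example, and the absolute numbers will vary with hardware and pandas version, so treat the timings above as indicative only.

```python
import timeit

import pandas as pd

# 100k identical codes, standing in for a real column (assumed sample data)
codes = pd.Series(["k1b6j2"] * 100_000)

def list_comp(s):
    # First method: plain Python slicing in a list comprehension
    return pd.Series([x[:3] + " " + x[3:] for x in s], index=s.index)

def str_accessor(s):
    # Second method: pandas .str slicing
    return s.str[:3] + " " + s.str[3:]

# Both approaches should produce the same values
same = list_comp(codes).tolist() == str_accessor(codes).tolist()

# Take the best of 5 single runs to reduce noise
t1 = min(timeit.repeat(lambda: list_comp(codes), number=1, repeat=5))
t2 = min(timeit.repeat(lambda: str_accessor(codes), number=1, repeat=5))

print("Results match:", same)
print("First method time: ", t1)
print("Second method time:", t2)
```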