I have a table with NaN values in the email column.
import pandas as pd
data = {'name': ['may','may', 'mary', 'james','james','john','paul', 'paul', 'joseph'],
'email' : ['may@gmail.com','NaN','Mary@gmail.com','James@gmail.com','NaN','NaN','Paul@gmail.com','NaN','NaN']}
df = pd.DataFrame(data)
BEFORE
DESIRED OUTPUT
But when I use ffill, I end up with this, which is incorrect. Is there a way I can use ffill but with conditions?
In your example, the NaN values are strings with the value "NaN". So before you fillna, you'd have to convert those to actual null values.
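A quick way to confirm this is to count the missing values before and after the conversion. This is only a minimal sketch on a shortened copy of the email column; the names emails, cleaned, n_missing_before and n_missing_after are made up for illustration.

```python
import numpy as np
import pandas as pd

# Shortened, illustrative copy of the email column from the question
emails = pd.Series(['may@gmail.com', 'NaN', 'Mary@gmail.com'])

# The literal string 'NaN' is an ordinary value, so isna() does not flag it
n_missing_before = int(emails.isna().sum())

# After swapping it for a real missing value, isna() does flag it
cleaned = emails.replace('NaN', np.nan)
n_missing_after = int(cleaned.isna().sum())

print(n_missing_before, n_missing_after)
```

If the first count is zero while the column visibly contains "NaN", the values are strings, and ffill has nothing to fill until they are converted.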
import pandas as pd
import numpy as np
data = {'name': ['may','may', 'mary', 'james','james','john','paul', 'paul', 'joseph'],
'email' : ['may@gmail.com','NaN','Mary@gmail.com','James@gmail.com','NaN','NaN','Paul@gmail.com','NaN','NaN']}
df = pd.DataFrame(data)
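# Convert the literal string 'NaN' into a real missing value
df['email'] = df['email'].replace({'NaN': np.nan})
# Forward-fill within each name only; fillna(method='ffill') is deprecated in recent pandas
df['email'] = df.groupby('name')['email'].ffill()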
df
     name            email
0     may    may@gmail.com
1     may    may@gmail.com
2    mary   Mary@gmail.com
3   james  James@gmail.com
4   james  James@gmail.com
5    john              NaN
6    paul   Paul@gmail.com
7    paul   Paul@gmail.com
8  joseph              NaN
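To see why the grouping acts as the "condition", it may help to put a plain ffill next to the grouped one. The sketch below uses a shortened version of the data from the question; the variable names plain and grouped are only for illustration.

```python
import numpy as np
import pandas as pd

# Shortened version of the question's data
df = pd.DataFrame({
    'name': ['james', 'james', 'john', 'paul', 'paul', 'joseph'],
    'email': ['James@gmail.com', 'NaN', 'NaN', 'Paul@gmail.com', 'NaN', 'NaN'],
})
df['email'] = df['email'].replace('NaN', np.nan)

# Plain ffill ignores names: john inherits james's email, joseph inherits paul's
plain = df['email'].ffill()

# Grouped ffill only copies a value down within the same name
grouped = df.groupby('name')['email'].ffill()

print(pd.DataFrame({'name': df['name'], 'plain': plain, 'grouped': grouped}))
```

With the plain version every gap gets filled, including the rows for john and joseph, who never had an email. With the grouped version those rows stay empty, because a name with no known email has nothing to fill from.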