I have a pandas DataFrame like the one below:
   NAME  EMAIL    HEIGHT  WEIGHT
1  jlka  NaN      170     70
2  qwer  eee@ttt  180     80
3  ioff  NaN      175     75
4  iowu  iou@add  170     60
I want to replace the NaN values in the 'EMAIL' column with random strings, with no duplicates. The strings do not need to contain @.
I tried writing a function that generates random strings, but every NaN was replaced with the same string, because I passed the function's result to the 'fillna' method.
From what I have seen here and in other Q&As, a function used with fillna is called only once, and every NaN is then filled with the single value it returned.
Should I use a 'for' loop to replace them one by one?
Or is there a more Pythonic way to replace them?
You could try something like this, using apply so that the generator runs once per NaN instead of once overall:
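import random
import string

import pandas as pd
from numpy import nan

df = pd.DataFrame({
    'Name': ['aaa', 'bbb', 'CCC'],
    'Email': [nan, 'ddd', nan]})

def processNan():
    # Build a 10-character string of random uppercase letters and digits.
    return ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(10))

# apply calls the lambda once per row, so each NaN gets its own random string.
# pd.isna is safer than `x is nan`, which only matches that one nan object.
df['Email'] = df['Email'].apply(lambda x: processNan() if pd.isna(x) else x)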
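Note that independent random draws make duplicates very unlikely but do not strictly rule them out. If you need a hard guarantee, here is a minimal sketch of one way to do it, assuming the same toy df as above. The names unique_tokens, mask and existing are made up for this example and are not part of pandas. It collects the new strings in a set until there are enough distinct ones, skips anything already present in the column, and then assigns them all at once through a boolean mask.

```python
import random
import string

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['aaa', 'bbb', 'CCC'],
    'Email': [np.nan, 'ddd', np.nan],
})

ALPHABET = string.ascii_uppercase + string.digits

def unique_tokens(n, taken, length=10):
    """Return n distinct random strings, none of which appears in `taken`.

    Hypothetical helper for illustration only.
    """
    tokens = set()
    while len(tokens) < n:
        candidate = ''.join(random.choices(ALPHABET, k=length))
        # Reject strings that clash with values already in the column;
        # the set itself rejects clashes among the new strings.
        if candidate not in taken:
            tokens.add(candidate)
    return list(tokens)

mask = df['Email'].isna()                # True where Email is missing
existing = set(df.loc[~mask, 'Email'])   # values that must not be reused
df.loc[mask, 'Email'] = unique_tokens(int(mask.sum()), existing)
```

After this runs, df['Email'].is_unique should be True and the rows that already had an email are left untouched. For a large frame this also avoids calling a Python lambda on every row, since only the missing rows are generated and assigned.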