
Replace NaN in pandas DataFrame with random strings without using fillna


I have a pandas DataFrame like the one below:

      NAME      EMAIL      HEIGHT      WEIGHT
1     jlka       NaN        170          70
2     qwer     eee@ttt      180          80
3     ioff       NaN        175          75
4     iowu     iou@add      170          60

I want to replace each NaN in the 'EMAIL' column with a random string, with no duplicates. The strings do not need to contain @.

I tried writing a function that generates random strings, but every NaN was replaced with the same random string, because in the end I passed its result to the 'fillna' method.

From what I have seen in other Q&As, a function used with fillna this way is called only once, so all the NaNs are replaced with the single value it returned.

Should I use a 'for' loop to replace them one by one?

Or is there a more Pythonic way to replace them?
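
A minimal sketch of the behaviour described above, assuming the attempt looked roughly like this (the helper name random_string and its body are illustrative, not the original code). It shows why every NaN ends up with the same value: the function call is evaluated once, before fillna ever sees it.

```python
import random
import string

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'NAME': ['jlka', 'qwer', 'ioff', 'iowu'],
    'EMAIL': [np.nan, 'eee@ttt', np.nan, 'iou@add'],
})

def random_string(length=10):
    # Hypothetical helper: random uppercase letters and digits.
    alphabet = string.ascii_uppercase + string.digits
    return ''.join(random.choice(alphabet) for _ in range(length))

# random_string() runs once here, so fillna receives a single string
# and writes that same string into every NaN slot.
filled = df['EMAIL'].fillna(random_string())

print(filled.iloc[0] == filled.iloc[2])  # True: both NaNs got the same value
```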


Solution

  • You could try something like this, which uses apply so that a new string is generated for each row:

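    import random
    import string

    import pandas as pd
    from numpy import nan

    df = pd.DataFrame({
        'Name': ['aaa', 'bbb', 'CCC'],
        'Email': [nan, 'ddd', nan]})

    def processNan(length=10):
        # Build a random string of uppercase letters and digits.
        alphabet = string.ascii_uppercase + string.digits
        return ''.join(random.choice(alphabet) for _ in range(length))

    # apply calls the lambda once per row, so each NaN gets its own string.
    # pd.isna is safer than `x is nan`, which only matches the np.nan object itself.
    df['Email'] = df['Email'].apply(lambda x: processNan() if pd.isna(x) else x)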
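
Note that random strings generated independently are very unlikely to collide, but nothing above strictly guarantees "no duplicates". A minimal sketch of one way to enforce that, assuming uuid4 hex strings are acceptable as placeholders (this is a swapped-in technique, not part of the answer above; the names mask, taken and new_values are illustrative):

```python
import uuid

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['aaa', 'bbb', 'CCC'],
    'Email': [np.nan, 'ddd', np.nan],
})

# Boolean mask marking the rows whose Email is missing.
mask = df['Email'].isna()

# Generate one fresh value per missing row. uuid4().hex is a 32-character
# random hex string; the set check also guards against clashing with
# values that are already present in the column.
taken = set(df.loc[~mask, 'Email'])
new_values = []
while len(new_values) < mask.sum():
    candidate = uuid.uuid4().hex
    if candidate not in taken:
        taken.add(candidate)
        new_values.append(candidate)

# Assign the list to the masked rows only; existing e-mails are untouched.
df.loc[mask, 'Email'] = new_values
```

Because the assignment goes through a boolean mask, no explicit for loop over the DataFrame rows is needed, and the uniqueness check happens on plain Python strings before anything is written back.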