This is my DataFrame:
from cmath import nan

student_card = pd.DataFrame({'ID': [20190103, 20190222, 20190531],
                             'name': ['Kim', nan, nan],
                             'class': ['H', 'W', 'S']})
student_card
It looks like this:
There are two NaN values in the 'name' column, and I want to fill them with 'missing1' and 'missing2' using a loop (there may be a way to do it without a loop, but I have no idea how to index them otherwise).
So I wrote this function and got stuck here. It doesn't work. Please give me some help, thanks.
import pandas as pd
def fillna_func(df):
    df = df.copy()
    for i, value in enumerate(df.values):
        if value == nan:
            df[i].apply("deleted{}".format(i))
    return df

fillna_func(student_card['name'])
You could create a mask marking where the name is null and use it to select those rows of the main DataFrame. Then update those names using the cumulative sum of the mask, which numbers the missing values 1, 2, and so on.
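import numpy as np
import pandas as pd

student_card = pd.DataFrame({'ID': [20190103, 20190222, 20190531],
                             'name': ['Kim', np.nan, np.nan],
                             'class': ['H', 'W', 'S']})

def fillna_func(df):
    # Boolean mask: True where 'name' is missing
    m = df['name'].isnull()
    # The running count of missing values is appended to 'missing';
    # only the masked rows are overwritten. Note that df is modified in place.
    df.loc[m, 'name'] = 'missing' + m.cumsum().astype(str)
    return df

fillna_func(student_card)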
Output:
         ID      name class
0  20190103       Kim     H
1  20190222  missing1     W
2  20190531  missing2     S
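
To see why the mask approach works where an equality check does not, here is a small sketch of the intermediate values. NaN never compares equal to anything, itself included, so a test like `value == nan` is always False and `isna()` is needed instead. The names `names`, `mask`, `counter`, `labels` and `filled` are just illustrative, and the last step uses `Series.mask` as an alternative that returns a new Series rather than changing the original; it is not part of the answer above.

```python
import numpy as np
import pandas as pd

# Illustrative data: just the 'name' column from the example
names = pd.Series(['Kim', np.nan, np.nan], name='name')

# NaN is not equal to itself, so an equality test cannot find it
print(np.nan == np.nan)      # False

# isna() is the reliable way to locate missing values
mask = names.isna()
print(mask.tolist())         # [False, True, True]

# The cumulative sum of the boolean mask counts the missing values
# seen so far, giving each missing row its own number
counter = mask.cumsum()
print(counter.tolist())      # [0, 1, 2]

# Build a label for every row, then keep it only where the mask is True.
# Series.mask returns a new Series and leaves `names` untouched.
labels = 'missing' + counter.astype(str)
filled = names.mask(mask, labels)
print(filled.tolist())       # ['Kim', 'missing1', 'missing2']
```

The label built for row 0 ('missing0') is never used, because only the rows where the mask is True are replaced.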