I want to fill in the missing gender values in a data set in proportion.
I use a boolean index with the head or tail function to select the top rows I want, but when I call fillna on that selection it has no effect. From what I have tried, fillna only works without the boolean index. How can I select the first 3 missing values in the example below and fill them with 0?
import numpy as np
import pandas as pd
a = pd.DataFrame(np.random.randn(50).reshape((10,5)))
a.loc[[1,3,4,6,9], 0] = np.nan
a[0][a[0].isnull()].head(3).fillna(value = '0', inplace = True)
After this runs, the NaN values in the DataFrame are still not filled.
Starting with data:
import numpy as np
import pandas as pd
a = pd.DataFrame(np.random.randn(50).reshape((10,5)))
a.loc[[1,3,4,6,9], 0] = np.nan
gives
          0         1         2         3         4
0 -0.388759 -0.660923  0.385984  0.933920  0.164083
1       NaN -0.996237 -0.384492  0.191026 -1.168100
2 -0.773971  0.453441 -0.543590  0.768267 -1.127085
3       NaN -1.051186 -2.251681 -0.575438  1.642082
4       NaN  0.123432  1.063412 -1.556765  0.839855
5 -1.678960 -1.617817 -1.344757 -1.469698  0.276604
6       NaN -0.813213 -0.077575 -0.064179  1.960611
7  1.256771 -0.541197 -1.577126 -1.723853  0.028666
8  0.236197  0.868503 -1.304098 -1.578005 -0.632721
9       NaN -0.227659 -0.857427  0.010257 -1.884986
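Your attempt has no effect because a[0][a[0].isnull()].head(3) returns a new object (a copy), so fillna with inplace=True fills that temporary copy and leaves a untouched. Since you only want to work on column zero, call fillna on that column with a limit of 3, which fills just the first three NaN values, and assign the result back to the column. Pass the number 0 rather than the string '0' so the column stays numeric:
a[0] = a[0].fillna(0, limit=3)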
gives
          0         1         2         3         4
0 -0.388759 -0.660923  0.385984  0.933920  0.164083
1  0.000000 -0.996237 -0.384492  0.191026 -1.168100
2 -0.773971  0.453441 -0.543590  0.768267 -1.127085
3  0.000000 -1.051186 -2.251681 -0.575438  1.642082
4  0.000000  0.123432  1.063412 -1.556765  0.839855
5 -1.678960 -1.617817 -1.344757 -1.469698  0.276604
6       NaN -0.813213 -0.077575 -0.064179  1.960611
7  1.256771 -0.541197 -1.577126 -1.723853  0.028666
8  0.236197  0.868503 -1.304098 -1.578005 -0.632721
9       NaN -0.227659 -0.857427  0.010257 -1.884986
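If you would rather keep the boolean-index approach from the question, one option is to use the mask only to look up the row labels and then assign through .loc on the original frame. This is a minimal sketch, not the only way to do it; the names rng and first_missing are made up for illustration, and the seeded generator is just an assumption to make the example repeatable.

```python
import numpy as np
import pandas as pd

# Hypothetical seeded generator so the example is repeatable.
rng = np.random.default_rng(0)
a = pd.DataFrame(rng.standard_normal((10, 5)))
a.loc[[1, 3, 4, 6, 9], 0] = np.nan

# Row labels of the first three missing entries in column 0.
# Use [-3:] instead of [:3] to pick the last three, like tail(3).
first_missing = a.index[a[0].isna()][:3]

# Assigning through .loc writes to the original frame, not to a copy.
a.loc[first_missing, 0] = 0

print(a[0])
```

Because the assignment goes through a.loc on the frame itself, the remaining NaN values (rows 6 and 9 here) are left as they are.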