I have a dataset in pandas with, say, two classes:
index | length | weight | label
------|--------|--------|------
0     | 1      | 2      | 0
1     | 2      | 3      | 0
2     | nan    | 4      | 0
3     | 6      | nan    | 0
4     | 30     | 40     | 1
5     | 45     | 35     | 1
6     | 18     | nan    | 1
df.fillna(df.mean())
returns a DataFrame in which each NaN is filled with the mean of its column. But I want to fill each NaN with the mean of that column within its class, so length at index 2 would be 3. The desired output looks like this:
index | length | weight | label
------|--------|--------|------
0     | 1      | 2      | 0
1     | 2      | 3      | 0
2     | 3      | 4      | 0
3     | 6      | 3      | 0
4     | 30     | 40     | 1
5     | 45     | 35     | 1
6     | 18     | 37.5   | 1
Is there a simple function for this, or should I implement it myself?
Use GroupBy.transform with mean to build a helper DataFrame of per-group means, then pass it to fillna:
df = df.fillna(df.groupby('label').transform('mean'))
print (df)
length weight label
0 1.0 2.0 0
1 2.0 3.0 0
2 3.0 4.0 0
3 6.0 3.0 0
4 30.0 40.0 1
5 45.0 35.0 1
6 18.0 37.5 1
Detail: the helper DataFrame has the same index as df and holds the class mean in every row:
print (df.groupby('label').transform('mean'))
length weight
0 3.0 3.0
1 3.0 3.0
2 3.0 3.0
3 3.0 3.0
4 31.0 37.5
5 31.0 37.5
6 31.0 37.5
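For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question; the variable names `class_means` and `filled` are only illustrative and not part of the original answer. Because the helper frame has no `label` column, `fillna` leaves that column untouched.

```python
import numpy as np
import pandas as pd

# Sample data from the question (NaN marks the missing values).
df = pd.DataFrame({
    "length": [1, 2, np.nan, 6, 30, 45, 18],
    "weight": [2, 3, 4, np.nan, 40, 35, np.nan],
    "label": [0, 0, 0, 0, 1, 1, 1],
})

# Helper frame: same index as df, each row holds the mean of its class.
# The grouping column 'label' is not included in the result.
class_means = df.groupby("label").transform("mean")

# fillna aligns on index and columns, so each NaN gets its class mean.
filled = df.fillna(class_means)
print(filled)
```

If only some columns should be filled, the same idea should work on a column subset, for example `df.groupby("label")[["length"]].transform("mean")`.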