My question might be too easy for many of you, but I'm a beginner with Python.
I want the percentage of a value in a column that contains 3 different possible values (1, 0, -1), while excluding one of those values (-1) from the calculation.
I did this: (df['col_name']).sum()/len(df.col_name)
However, this also counts the -1 values, while I only want the share of the value 1 out of the total, with the -1 rows left out of that total.
Thank you for your help.
To exclude the -1 values from the sum, replace them with missing values:
df['col_name'].replace(-1, np.nan).sum()/len(df.col_name)
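Keep in mind that len(df.col_name) still counts every row, including the ones that held -1. A minimal sketch of the difference, assuming a small made-up column (the values below are hypothetical, not from the question): Series.mean() skips NaN by default, so after the replacement it divides only by the rows that are not -1.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, chosen only for illustration
df = pd.DataFrame({'col_name': [1, 0, -1, 1, -1]})

# -1 becomes NaN, so it no longer contributes to the sum
masked = df['col_name'].replace(-1, np.nan)

# Denominator is all 5 rows: share of 1s among every row
share_all_rows = masked.sum() / len(df.col_name)

# mean() skips NaN, so the denominator is the 3 rows that are not -1
share_without_minus_one = masked.mean()

print(share_all_rows)
print(share_without_minus_one)
```

If the -1 rows should be left out of the total as well, the second value is probably the one you want.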
Or filter out the -1 values if you also need the length of the filtered Series:
import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame({'col_name':np.random.choice([0,1,-1], size=10)})
print (df)
   col_name
0        -1
1         1
2        -1
3        -1
4         0
5        -1
6        -1
7         1
8        -1
9         1
s = df.loc[df['col_name'] != -1, 'col_name']
print (s)
1    1
4    0
7    1
9    1
Name: col_name, dtype: int32
print (s.sum()/len(s))
0.75
print (s.mean())
0.75
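The mean equals the share of 1s only because the remaining values are 0 and 1. As a sketch of a more explicit variant, assuming the same values as the printed df above (typed in by hand here rather than drawn at random), you can compare against 1 directly, or use value_counts(normalize=True) to get the share of every remaining value:

```python
import pandas as pd

# Same values as the printed df above, entered by hand
df = pd.DataFrame({'col_name': [-1, 1, -1, -1, 0, -1, -1, 1, -1, 1]})

# Keep only the rows that are not -1
s = df.loc[df['col_name'] != -1, 'col_name']

# Boolean mask: True where the value is 1; its mean is the fraction of 1s
pct_ones = (s == 1).mean() * 100

# Fraction of each remaining value (the fractions sum to 1)
shares = s.value_counts(normalize=True)

print(pct_ones)
print(shares)
```

The comparison with 1 keeps working if the column later gains other values besides 0 and 1.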