I have a DataFrame df like this:
A  B  C  D  Final
a  b  c  d  Valid
a     c     Valid
a     c  d  Valid
a           Valid
I want to calculate how many values are present in each column, as a count and as a percentage of the rows.
My desired output is:
output = a=4,b=1,c=3,d=2
Please help
If the empty values are missing values (NaN), use drop with count:
print (df)
A B C D Final
0 a b c d Valid
1 a NaN c NaN Valid
2 a NaN c d Valid
3 a NaN NaN NaN Valid
df = df.drop('Final', axis=1).count()
print (df)
A 4
B 1
C 3
D 2
dtype: int64
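For reference, a self-contained sketch of the NaN case. The DataFrame is rebuilt by hand from the sample in the question, and the names counts and percent are only illustrative. It keeps the original df intact so the row count is still available for the percentage:

```python
import numpy as np
import pandas as pd

# Sample data from the question, with the empty cells as NaN.
df = pd.DataFrame({
    "A": ["a", "a", "a", "a"],
    "B": ["b", np.nan, np.nan, np.nan],
    "C": ["c", "c", "c", np.nan],
    "D": ["d", np.nan, "d", np.nan],
    "Final": ["Valid"] * 4,
})

# count() ignores NaN, so this is the number of filled cells per column.
counts = df.drop("Final", axis=1).count()

# Divide by the number of rows of the original frame to get percentages.
percent = counts.div(len(df)).mul(100)

print(counts.to_dict())
print(percent.to_dict())
```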
If the empty values are empty strings, first compare with ne (not equal) and then sum the True values:
print (df)
   A  B  C  D  Final
0  a  b  c  d  Valid
1  a     c     Valid
2  a     c  d  Valid
3  a           Valid
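counts = df.drop('Final', axis=1).ne('').sum()
print (counts)
A 4
B 1
C 3
D 2
dtype: int64
print (counts.to_dict())
{'B': 1, 'A': 4, 'C': 3, 'D': 2}
For percentages, divide the counts by the number of rows of the original DataFrame and multiply by 100:
d = counts.div(len(df.index)).mul(100).to_dict()
print (d)
{'B': 25.0, 'A': 100.0, 'C': 75.0, 'D': 50.0}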
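If the data might contain both NaN and empty strings, the two checks can be combined. This is a minimal sketch under that assumption. The sample frame is made up to mix both kinds of empty cell, and lowercasing the column names for the a=4,b=1,... string is a guess at the format asked for in the question:

```python
import numpy as np
import pandas as pd

# Hypothetical sample mixing NaN and empty strings as "empty" cells.
df = pd.DataFrame({
    "A": ["a", "a", "a", "a"],
    "B": ["b", "", np.nan, ""],
    "C": ["c", "c", "c", ""],
    "D": ["d", np.nan, "d", ""],
    "Final": ["Valid"] * 4,
})

data = df.drop("Final", axis=1)

# A cell counts as present if it is neither NaN nor an empty string.
present = data.notna() & data.ne("")
counts = present.sum()
percent = counts.div(len(df)).mul(100)

# Build the string from the question; lowercasing the column name is an assumption.
summary = ",".join(f"{col.lower()}={n}" for col, n in counts.items())
print(summary)
print(percent.to_dict())
```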