I am using pandas with groupby, but I am not sure how to filter unwanted rows out of the result. I only want the count for the 1s, not the 0s.
print(df.groupby(item)['event'].apply(lambda x: (x==1).sum()).reset_index(name='count'))
hbp count
0 55
1 36
I've also tried the following, but it returns the same count for every column.
print(patients.query("event==1 and hbp==1").groupby('hbp').count())
     age  cp  eject ......
hbp
1     36  36     36
My desired output is below.
hbp count
1 36
Here is my data I am reading into the dataframe patients. I am trying to get the hbp data where event is 1 and hbp is 1 using query or group by.
Currently I am getting counts for both 1 and 0, but I only want to display the count for hbp 1. See the desired output above.
hbp event ..etc
1 1
0 1
1 1
1 1
0 1
Your second attempt works if you select a single column after groupby and then use sum, count, or size. All three give the same result here, because after filtering every remaining event value is 1:
print(patients.query("event==1 and hbp==1")
              .groupby('hbp')['event']
              .sum()
              .reset_index(name='count'))
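To see why sum, count, and size agree after filtering, here is a minimal sketch. The small `patients` frame is made-up sample data (an assumption for illustration, not the asker's real data), and the variable names `by_sum`, `by_count`, and `by_size` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the real `patients` frame has more columns.
patients = pd.DataFrame({
    "hbp":   [1, 0, 1, 1, 0, 1],
    "event": [1, 1, 1, 1, 1, 0],
})

# Keep only rows where both flags are 1, then select one column after groupby.
grouped = patients.query("event == 1 and hbp == 1").groupby("hbp")["event"]

# Every remaining event value is 1, so summing, counting non-null values,
# and counting rows all produce the same number.
by_sum = grouped.sum().reset_index(name="count")
by_count = grouped.count().reset_index(name="count")
by_size = grouped.size().reset_index(name="count")

print(by_sum)
```

Note that sum only matches the row count because the filter leaves nothing but 1s in event; on unfiltered data, size or count would be the safer choice for counting rows.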
Or filter first and then count the hbp values with Series.value_counts:
print(patients.query("event==1 and hbp==1")['hbp']
              .value_counts()
              .reset_index(name='count'))
hbp count
0 1 36
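One caveat: the column names you get from value_counts followed by reset_index depend on the pandas version. From pandas 2.0 onward the index of the result is named after the original column, while older versions leave it unnamed, so the first column comes out as index. A sketch that names the index explicitly with rename_axis should give hbp and count columns either way. The sample frame is again made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the real `patients` frame has more columns.
patients = pd.DataFrame({
    "hbp":   [1, 0, 1, 1, 0, 1],
    "event": [1, 1, 1, 1, 1, 0],
})

counts = (
    patients.query("event == 1 and hbp == 1")["hbp"]
    .value_counts()             # count occurrences of each remaining hbp value
    .rename_axis("hbp")         # name the index so reset_index yields an 'hbp' column
    .reset_index(name="count")  # turn the counts into a 'count' column
)

print(counts)
```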