I have a dataframe like this one, https://i.sstatic.net/2Sr29.png. RBD is a code that identifies each school, LET_CUR corresponds to a class and MRUN corresponds to the amount of students in each class, what i need is the following:
I would like to know how many of the schools have at least one class with more than 45 students, so far I haven't figured out yet a code to do that.
Thanks.
From your DataFrame
:
>>> import pandas as pd
>>> from io import StringIO
>>> df = pd.read_csv(StringIO("""
RBD,LET_CUR,MRUN
1,A,65
1,B,23
1,C,21
2,A,22
2,B,20
2,C,34
3,A,54
4,A,23
4,B,11
5,A,15
5,C,16
6,A,76"""))
>>> df = df.set_index(['RBD', 'LET_CUR'])
>>> df
MRUN
RBD LET_CUR
1 A 65
B 23
C 21
2 A 22
B 20
C 34
3 A 54
4 A 23
B 11
5 A 15
C 16
6 A 76
As we want to know the number of school with at leat one class having more than 45 students, we can first filter the DataFrame
on the column MRUN
and then use the nunique()
method to count the number of unique school :
>>> df_filtered = df[df['MRUN'] > 45].reset_index()
>>> df_filtered['RBD'].nunique()
3