Count one level grouped by another level in a pandas MultiIndex


I have a pandas DataFrame with a three-level MultiIndex: date / ID / period.

date       / ID / period   VALUE
2020-01-01 / 1  / 1        1
2020-01-01 / 1  / 2        2
2020-01-01 / 1  / 3        1
2020-01-01 / 2  / 1        2
2020-01-01 / 2  / 2        1
2020-01-01 / 3  / 2        2

I want to count the periods for each ID, grouped by date, so that I can filter on that count, for example to select all IDs with at least 2 periods as of each date. The desired counts are:

date       / ID   COUNT
2020-01-01 / 1    3
2020-01-01 / 2    2
2020-01-01 / 3    1

Solution

  • You can pass level=[0, 1] to .groupby to group by the first two index levels (date and ID) and count the rows in each group:

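    import pandas as pd

    # Tuple keys become a three-level MultiIndex (date, ID, period).
    df = pd.DataFrame({'VALUE': {('2020-01-01', 1, 1): 1,
      ('2020-01-01', 1, 2): 2,
      ('2020-01-01', 1, 3): 1,
      ('2020-01-01', 2, 1): 2,
      ('2020-01-01', 2, 2): 1,
      ('2020-01-01', 3, 2): 2}})
    
    # Count the non-null rows for each (date, ID) pair.
    df.groupby(level=[0, 1]).count()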
    
    Out[1]: 
                  VALUE
    2020-01-01 1      3
               2      2
               3      1
    

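    Alternatively, sum the non-null flags per group. Note that DataFrame.sum(level=...) was deprecated in pandas 1.3 and removed in 2.0, so group by the levels first:

    df.notnull().groupby(level=[0, 1]).sum()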
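
The question also asks how to keep only the IDs with at least 2 periods on each date. A minimal sketch of one way to do that follows. The level names (date, ID, period) are taken from the question's table header and are assumptions, since the frame in the answer has unnamed levels; min_periods is a hypothetical variable name for the threshold.

```python
import pandas as pd

# Rebuild the sample frame from the question. The level names are an
# assumption based on the question's header (date / ID / period).
index = pd.MultiIndex.from_tuples(
    [
        ("2020-01-01", 1, 1),
        ("2020-01-01", 1, 2),
        ("2020-01-01", 1, 3),
        ("2020-01-01", 2, 1),
        ("2020-01-01", 2, 2),
        ("2020-01-01", 3, 2),
    ],
    names=["date", "ID", "period"],
)
df = pd.DataFrame({"VALUE": [1, 2, 1, 2, 1, 2]}, index=index)

min_periods = 2  # hypothetical name; "at least 2 periods" in the question

# Non-null rows per (date, ID) pair, as a Series indexed by (date, ID).
counts = df.groupby(level=["date", "ID"])["VALUE"].count()

# Only the (date, ID) pairs that meet the threshold.
selected = counts[counts >= min_periods]

# To keep the original rows instead of the counts, broadcast each group's
# count back onto its rows with transform and use it as a boolean mask.
group_count = df.groupby(level=["date", "ID"])["VALUE"].transform("count")
filtered = df[group_count >= min_periods]
```

Here selected holds the counts for IDs 1 and 2 only, while filtered keeps the original rows for those IDs with the period level intact. Which of the two is wanted depends on whether the counts themselves or the underlying rows are needed afterwards.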