pandas groupby count the number of zeros in a column


I have a DataFrame, e.g.:

Date             B           C   
20.07.2018      10           8
20.07.2018       1           0
21.07.2018       0           1
21.07.2018       1           0

How can I count the zero and non-zero values for each column for each date? Using .sum() doesn't help me because it will sum the non-zero values.

E.g., the expected output for the zero values:

Date          B         C
20.07.2018    0         1
21.07.2018    1         1

Solution

  • I believe you need DataFrameGroupBy.agg, comparing each value with 0 and summing the resulting booleans:

    a) To count the number of zero values:

    df1 = df.groupby('Date').agg(lambda x: x.eq(0).sum())
    print(df1)
    
                B  C
    Date            
    20.07.2018  0  1
    21.07.2018  1  1
    

    b) To count the number of non-zero values:

    df2 = df.groupby('Date').agg(lambda x: x.ne(0).sum())
    print(df2)
                B  C
    Date            
    20.07.2018  2  1
    21.07.2018  1  1
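
For anyone who wants to reproduce a) and b) end to end, here is a minimal sketch. It assumes the sample data from the question with Date kept as plain strings; the `both` table and its "zero"/"nonzero" labels are only illustrative names, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question; Date is kept as a plain string column.
df = pd.DataFrame({
    "Date": ["20.07.2018", "20.07.2018", "21.07.2018", "21.07.2018"],
    "B": [10, 1, 0, 1],
    "C": [8, 0, 1, 0],
})

grouped = df.groupby("Date")

# a) zeros per column within each date: x.eq(0) is a boolean Series,
#    and summing booleans counts the True values.
zeros = grouped.agg(lambda x: x.eq(0).sum())

# b) non-zeros per column within each date.
nonzeros = grouped.agg(lambda x: x.ne(0).sum())

# Optional side-by-side view with two-level column labels
# ("zero"/"nonzero" are hypothetical names chosen for this sketch).
both = pd.concat({"zero": zeros, "nonzero": nonzeros}, axis=1)
print(both)
```

The lambda runs once per column per group, which is easy to read but can be slow when there are many groups.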
    

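    Another idea to improve performance is to move Date into the index (a DatetimeIndex if Date is parsed as datetime), compare the columns with 0, and then sum per index level. Note that sum(level=0) was deprecated in pandas 1.3 and removed in 2.0; groupby(level=0).sum() is the equivalent:

    df1 = df.set_index('Date').eq(0).groupby(level=0).sum()
    print(df1)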
                B  C
    Date            
    20.07.2018  0  1
    21.07.2018  1  1
    
    df2 = df.set_index('Date').ne(0).groupby(level=0).sum()
    print(df2)
                B  C
    Date            
    20.07.2018  2  1
    21.07.2018  1  1
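
A possible self-contained version of the index-based idea is sketched below. It assumes the Date strings are in day.month.year form and parses them first, since `set_index('Date')` on a string column does not by itself produce a DatetimeIndex. The name `is_zero` is introduced here for illustration only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date": ["20.07.2018", "20.07.2018", "21.07.2018", "21.07.2018"],
    "B": [10, 1, 0, 1],
    "C": [8, 0, 1, 0],
})

# Parse the day.month.year strings so the index becomes a real DatetimeIndex
# (the format string is an assumption based on the sample values).
df["Date"] = pd.to_datetime(df["Date"], format="%d.%m.%Y")

# Boolean mask of zeros for all value columns at once, indexed by date.
is_zero = df.set_index("Date").eq(0)

# Summing the booleans per index level counts the zeros per date.
zeros = is_zero.groupby(level=0).sum()

# Non-zeros are the complement of the same mask.
nonzeros = (~is_zero).groupby(level=0).sum()

print(zeros)
print(nonzeros)
```

Because the comparison is done once on the whole frame and only a built-in sum runs per group, this avoids calling a Python lambda for every group.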