python pandas series rolling-computation

Pandas dataframe get expanding/rolling value counts


Is it possible to create value counts for a series in a cumulative/expanding fashion? E.g. for the following series

s = pd.Series(['a', 'a', 'b', 'c', 'c'])

I'd want

   a  b  c
   1  0  0
   2  0  0
   2  1  0
   2  1  1
   2  1  2

Solution

  • Encode the series as indicator (dummy) variables, then take the cumulative sum of each column:

    s.str.get_dummies().cumsum()
    

       a  b  c
    0  1  0  0
    1  2  0  0
    2  2  1  0
    3  2  1  1
    4  2  1  2
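
For completeness, here is a self-contained sketch of the same idea, plus a rolling variant since the title asks about both. The window size of 2 and the variable names (`dummies`, `expanding`, `rolling`) are assumptions chosen for illustration; they are not from the original answer.

```python
import pandas as pd

s = pd.Series(['a', 'a', 'b', 'c', 'c'])

# One indicator column per distinct value (1 where the row holds that value).
dummies = s.str.get_dummies()

# Expanding counts: running total of each indicator column.
expanding = dummies.cumsum()

# Rolling counts over the last 2 rows (window size is an assumed example).
# min_periods=1 avoids NaN on the first row; rolling sum returns floats,
# so cast back to int.
rolling = dummies.rolling(window=2, min_periods=1).sum().astype(int)

print(expanding)
print(rolling)
```

One caveat: `str.get_dummies` splits each string on `|` by default, so values that contain that character would be broken into several columns. If that could happen in your data, pass a different `sep`.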