python pandas binning

Way to bin a dataframe into deciles based on sum of another column


Using pandas, I'm trying to bin a dataframe into deciles using a ranked score (x), such that each decile contains an equal share of the sum of a different column (y).

In other words, each decile is filled until its y values reach a set threshold (sum of y // 10), and then the next decile starts.

I've tried using cut and qcut, but they only split by x, not by the values in y.


Solution

  • You can do this with cumsum + groupby:

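    # Key each group by the running total of y, floor-divided into steps of 10
    d = {x: y for x, y in df.groupby(df.y.cumsum() // 10)}
    d[0]  # rows that fall in the first bin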
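The snippet above hard-codes a step of 10, which matches the question only when y sums to 100. A minimal sketch of one way to generalize it follows, assuming y is non-negative with a positive total and that rows should be filled in ascending order of x. The sample data, the function name `bin_by_cumulative_sum`, and the `decile` column are hypothetical and not from the original post. It also uses the running total *before* each row rather than after it, so the row that completes a bin stays in that bin; this is a variation on the answer, not the answer's exact behavior.

```python
import pandas as pd

# Hypothetical sample data: x is the ranked score, y is the column to balance.
df = pd.DataFrame({"x": range(1, 21), "y": [5] * 20})


def bin_by_cumulative_sum(df, rank_col="x", value_col="y", n_bins=10):
    """Assign rows to n_bins so each bin holds roughly total(y) / n_bins."""
    out = df.sort_values(rank_col).reset_index(drop=True)
    total = out[value_col].sum()
    # Running total of y accumulated before each row (0 for the first row).
    cum_before = out[value_col].cumsum() - out[value_col]
    # Equivalent to cum_before // (total / n_bins), kept in integer form
    # when y is integer-valued.
    bins = cum_before * n_bins // total
    # Trailing rows with y == 0 would otherwise land in an extra bin.
    out["decile"] = bins.clip(upper=n_bins - 1).astype(int)
    return out


result = bin_by_cumulative_sum(df)

# Same dictionary-of-groups shape as the answer above.
groups = {k: g for k, g in result.groupby("decile")}

# With this sample data every decile sums to 10.
sums = result.groupby("decile")["y"].sum()
```

When the y values do not divide evenly, the bins can only be approximately equal, because a single row is never split across two deciles.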