python python-3.x alibaba-cloud

How can I find cumulative count within a group using Alibaba PyODPS?


Suppose I have a data frame named iris with the columns name, sepallength, sepalwidth, petalwidth and petallength. I want to find the cumulative count of sepallength within each group.

My code:

iris['name', 'sepallength', iris.groupby('name').sort('sepallength').sepallength.count()].head(5)

But it shows the wrong result. What am I missing?


Solution

  • Use cumcount instead of count. cumcount is a window function that returns a running count for every row in the group, while count is an aggregation that returns a single value per group.

    iris['name', 'sepallength', iris.groupby('name').sort('sepallength').sepallength.cumcount()].head(5)
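
To make the difference concrete, here is a minimal plain-Python sketch of the two behaviours. It does not use PyODPS; the sample rows and the helper names group_count and group_cumcount are hypothetical, and it assumes the running count starts at 1 for the first row of each group.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows (name, sepallength); not taken from the real Iris data.
rows = [
    ("setosa", 5.1),
    ("setosa", 4.9),
    ("versicolor", 7.0),
    ("setosa", 4.7),
    ("versicolor", 6.4),
]


def group_count(rows):
    """Aggregation: collapse each group to a single count."""
    return dict(Counter(name for name, _ in rows))


def group_cumcount(rows):
    """Window-style: keep every row and attach a running count,
    ordered by sepallength within each name group."""
    running = defaultdict(int)
    result = []
    for name, length in sorted(rows, key=lambda r: (r[0], r[1])):
        running[name] += 1  # assumed to be 1-based
        result.append((name, length, running[name]))
    return result


print(group_count(rows))
for row in group_cumcount(rows):
    print(row)
```

The aggregation yields one value per group, so it cannot line up row by row with name and sepallength; the window version keeps one output row per input row, which is what the question asks for.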