Tags: python, databricks, spark-koalas

How to get number of groups in a groupby object in koalas?


In pandas we can use the ngroups attribute, but it is not implemented yet in koalas.

Suppose the groupby object is called dfgroup.

Any ideas?


Solution

  • You need to force execution of some function on the GroupBy object (type: databricks.koalas.groupby.DataFrameGroupBy) and take the length of the result. For example, dfgroup.size() returns a databricks.koalas.series.Series with one entry per group, so calling len on it gives the number of groups (the example is adapted from the documentation):

    >>> grouped = df.groupby(['Animal'])
    >>> type(grouped.size())
    <class 'databricks.koalas.series.Series'>
    >>> len(grouped.size())
    2