pandas: Access column-like results from "groupby" and "agg"?


I am using groupby and agg to summarize groups of dataframe rows. I summarize each group in terms of its count and size:

>>> import pandas as pd
>>> df = pd.DataFrame([
...    [ 1, 2, 3 ],
...    [ 2, 3, 1 ],
...    [ 3, 2, 1 ],
...    [ 2, 1, 3 ],
...    [ 1, 3, 2 ],
...    [ 3, 3, 3 ] ],
...    columns=['A','B','C'] )

>>> gbB = df.groupby('B',as_index=False)
>>> Cagg = gbB.C.agg(['count','size'])
>>> Cagg
      B  count  size
   0  1      1     1
   1  2      2     2
   2  3      3     3

The result looks like a dataframe with columns for the grouping variable B and for the summaries count and size:

>>> Cagg.columns
    Index(['B', 'count', 'size'], dtype='object')

However, I can't access the count and size columns as attributes, either to manipulate them further as Series or to convert them with to_list:

>>> Cagg.count
   <bound method DataFrame.count of    B  count  size
   0  1      1     1
   1  2      2     2
   2  3      3     3>

>>> Cagg.size
   9

Can I access the individual column-like data with headings count and size?


Solution

  • Don't use attribute access for these columns: the names count and size collide with the existing DataFrame.count method and DataFrame.size property, which take precedence.

    Go with indexing using square brackets:

    Cagg['count']
    
    # 0    1
    # 1    2
    # 2    3
    # Name: count, dtype: int64
    
    Cagg['size']
    
    # 0    1
    # 1    2
    # 2    3
    # Name: size, dtype: int64
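
For completeness, here is a minimal self-contained sketch of the same idea. It rebuilds the frame from the question, pulls both columns out with square brackets, and converts them with to_list. Two things in it are assumptions rather than part of the original answer: it uses reset_index() instead of as_index=False to get B back as a column, and the labels n_count and n_size in the optional rename step are hypothetical names chosen only because they do not collide with anything on DataFrame.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3], [2, 3, 1], [3, 2, 1], [2, 1, 3], [1, 3, 2], [3, 3, 3]],
    columns=["A", "B", "C"],
)

# Same summary as in the question; reset_index() turns the group key B
# into an ordinary column (assumption: used here in place of as_index=False).
Cagg = df.groupby("B")["C"].agg(["count", "size"]).reset_index()

# Bracket indexing returns the column as a Series, even when the label
# matches a DataFrame method or property.
counts = Cagg["count"].to_list()
sizes = Cagg["size"].to_list()

# Attribute access resolves to the DataFrame API instead of the columns:
# Cagg.count is the bound method, Cagg.size is the number of cells.
is_method = callable(Cagg.count)
n_cells = Cagg.size

# Optional: rename to labels that do not collide (hypothetical names),
# after which attribute access works as well.
renamed = Cagg.rename(columns={"count": "n_count", "size": "n_size"})
n_count = renamed.n_count.to_list()
```

Bracket indexing is the safer default because it works for any column label; renaming only helps if attribute-style access is really wanted.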