Search code examples
pandaspandas-groupbydata-visualizationpivot-table

Pandas function for showing aggfunc at every level


Let's propose I have a pivot table that looks like this:

pd.pivot_table(
  data, 
  columns=['A','B','C'],
  values='widgets',
  aggfunc='count'
).T
[Column] Count
A  B  C  1
      D  2
   E  F  3
      G  4
H  I  J  5
   K  L  6

What I want is:

A  10  B  3  C  1
             D  2
       E  7  F  3
             G  4
H  11  I  11 J  5
             L  6

with intermediary sums of each category in between the final count.


Solution

  • Make sure index levels are named:

    df = pd.DataFrame(
        {'Count': [1, 2, 3, 4, 5, 6]},
        pd.MultiIndex.from_tuples([
            ('A', 'B', 'C'),
            ('A', 'B', 'D'),
            ('A', 'E', 'F'),
            ('A', 'E', 'G'),
            ('H', 'I', 'J'),
            ('H', 'K', 'L')
        ], names=['One', 'Two', 'Three'])
    )
    
    df
    
                   Count
    One Two Three       
    A   B   C          1
            D          2
        E   F          3
            G          4
    H   I   J          5
        K   L          6
    

    from functools import reduce
    import pandas as pd
    
    names = df.index.names
    reduce(
        pd.DataFrame.join,
        [df.groupby(level=names[:i+1]).sum().add_suffix(f'_{names[i]}')
         for i in range(df.index.nlevels)]
    )
    
                   Count_One  Count_Two  Count_Three
    One Two Three                                   
    A   B   C             10          3            1
            D             10          3            2
        E   F             10          7            3
            G             10          7            4
    H   I   J             11          5            5
        K   L             11          6            6