Tags: python, python-3.x, pandas, group-by, multi-index

Summing over a multiindex level in a pandas series


I would like to sum (marginalize) over one level of a Series with a 3-level MultiIndex to produce a Series with a 2-level MultiIndex. For example, suppose I have the following:

import pandas as pd

ind = [tuple(x) for x in ['ABC', 'ABc', 'AbC', 'Abc', 'aBC', 'aBc', 'abC', 'abc']]
mi = pd.MultiIndex.from_tuples(ind)
data = pd.Series([264, 13, 29, 8, 152, 7, 15, 1], index=mi)

A  B  C    264
      c     13
   b  C     29
      c      8
a  B  C    152
      c      7
   b  C     15
      c      1

I would like to sum over the C variables to produce the following output:

A  B    277
   b     37
a  B    159
   b     16

What is the best way to do this in pandas?


Solution

  • If you know you always want to keep the first two levels and sum out the remaining one, group by those two levels and sum:

    In [27]: data.groupby(level=[0, 1]).sum()
    Out[27]:
    A  B    277
       b     37
    a  B    159
       b     16
    dtype: int64
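
If the level to sum out is not always the last one, the same idea can be generalized by grouping on every level except the one being dropped. The following is only a sketch: the helper name `sum_out_level` is made up for illustration and is not part of pandas, and it assumes levels are referred to by integer position, as in the example above.

```python
import pandas as pd

ind = [tuple(x) for x in ['ABC', 'ABc', 'AbC', 'Abc', 'aBC', 'aBc', 'abC', 'abc']]
mi = pd.MultiIndex.from_tuples(ind)
data = pd.Series([264, 13, 29, 8, 152, 7, 15, 1], index=mi)


def sum_out_level(series, level):
    """Sum over one index level (given by position), keeping all others.

    Hypothetical helper, not a pandas function.
    """
    nlevels = series.index.nlevels
    level = level % nlevels  # allow negative positions such as -1
    # Group by every level except the one being summed out.
    keep = [i for i in range(nlevels) if i != level]
    return series.groupby(level=keep).sum()


# Summing out the last level matches data.groupby(level=[0, 1]).sum().
result = sum_out_level(data, -1)
print(result)

# Summing out the middle level instead keeps levels 0 and 2.
print(sum_out_level(data, 1))
```

Note that `groupby` sorts the group keys by default, so the result comes back in sorted index order rather than the original row order.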