Search code examples
pythonpandasdataframenumpymulti-index

Set cross section of pandas MultiIndex to DataFrame from addition of other cross sections


I am currently trying to assign rows with certain indices based on the other indices within the group.

Consider the following pandas data frame:

import pandas as pd
import numpy as np

index = pd.MultiIndex.from_product([list('abc'), ['aa', 'bb', 'cc']])
df = pd.DataFrame({'col1': np.arange(9), 
                   'col2': np.arange(9, 18), 
                   'col3': np.arange(18,27)}, 
                   index=index)

Output of df:

      col1  col2  col3
a aa     0     9    18
  bb     1    10    19
  cc     2    11    20
b aa     3    12    21
  bb     4    13    22
  cc     5    14    23
c aa     6    15    24
  bb     7    16    25
  cc     8    17    26

I want to assign the indices 'cc' equal to 'aa' plus 'bb' per the first level of indices.

The following works fine, but I am wondering if there is a way to set values without having to reference the underlying NumPy array.

df.loc[pd.IndexSlice[:, 'cc'], :] = (df.xs('aa', level=1) 
                                     + df.xs('bb', level=1)).values

Is there a way to set the 'cc' rows directly to the output below? I believe the issue with trying to set the below directly is due to an index mismatch. Can I get around this somehow?

df.xs('aa', level=1) + df.xs('bb', level=1)

Solution

  • Update

    You can use pandas.DataFrame.iloc

    df.iloc[df.index.get_level_values(1)=='cc'] = df.xs('aa', level=1) + df.xs('bb', level=1)
    

    Old answer

    You can do this:

    df[df.index.get_level_values(1)=='cc'] = df.xs('aa', level=1) + df.xs('bb', level=1)
    

    Disclaimer: it works for pandas version 1.2.1 and it doesn't work in pandas 1.2.3. I haven't tested any other version