Search code examples
pythonpandasmultiple-columnsadditionmulti-index

How to apply to one set of columns in a dataframe with multi-index columns


I have a dataframe with multi-index column as you see below, and what I am trying to do is for each user and for m1 and m2, I need to get the value for today+0.25*value for yesterday.

               m1              m2  
day           yesterday today yesterday today
user                                      
id1           5         6     7         8
id2           3         4     9         10

I have tried the following, but I get NA values in all "today" columns:

df.iloc[:, df.columns.get_level_values(1)=='today'] = 
df.iloc[:, df.columns.get_level_values(1)=='yesterday'] *0.25 +
 df.iloc[:, df.columns.get_level_values(1)=='today']

My search on other stackoverflow posts has led me to possible answer of grouping by level = 0, but I don't know how to go from there and how to apply a function to the result of the group by. Or perhaps that is not the right way to approach this?

df.groupby(level=0, axis=1).apply(...)

Firstly, what I need to get to is :

               m1                      m2  
day           yesterday today          yesterday today
user                                      
id1               5     6+0.25*5       7         8+0.25*7
id2               3     4+0.25*3       9         10+0.25*9

Ultimately, I need to get to :


user           m1              m2             
id1            6+0.25*5        8+0.25*7
id2            4+0.25*3        10+0.25*9

p.s. this is my first time asking question on StackOverflow, I did my best ! but please let me know if I need to modify my question to follow the guidelines. Thank you!


Solution

  • Use DataFrame.xs for selecting, so is possible add DataFrames together with multiple by constant.

    Advantage of function is removed top levels (default value drop_level=True) - so working with no MultiIndex in output DataFrames.

    print (df.xs('today', axis=1, level=1))
         m1  m2
    id1   6   8
    id2   4  10
    
    print (df.xs('yesterday', axis=1, level=1))
         m1  m2
    id1   5   7
    id2   3   9
    
    df1 = df.xs('today', axis=1, level=1) + 0.25 *df.xs('yesterday', axis=1, level=1)
    print (df1)
           m1     m2
    id1  7.25   9.75
    id2  4.75  12.25