Search code examples
pythonpandasdataframemulti-index

Replace specific values in multiindex dataframe


I have a multindex dataframe with 3 index levels and 2 numerical columns.

A   1   2017-04-01  14.0    87.346878
        2017-06-01  4.0     87.347504
    2   2014-08-01  1.0     123.110001
        2015-01-01  4.0     209.612503
B   3   2014-07-01  1.0     68.540001
        2014-12-01  1.0     64.370003
    4   2015-01-01  3.0     75.000000

I want to replace the values in first row of 3rd index level wherever a new second level index begins. For ex: every first row

(A,1,2017-04-01)->0.0   0.0 
(A,2,2014-08-01)->0.0   0.0  
(B,3,2014-07-01)->0.0   0.0  
(B,4,2015-01-01)->0.0   0.0

The dataframe is too big and doing it datframe by dataframe like df.xs('A,1')...df.xs(A,2) gets time consuming. Is there some way where i can get a mask and replace with new values in these positions ?


Solution

  • Use DataFrame.reset_index on level=2, then use DataFrame.groupby on level=[0, 1] and aggregate level_2 using first, then using pd.MultiIndex.from_arrays create a multilevel index, finally use this multilevel index to change the values in dataframe:

    idx = df.reset_index(level=2).groupby(level=[0, 1])['level_2'].first()
    idx = pd.MultiIndex.from_arrays(idx.reset_index().to_numpy().T)
    df.loc[idx, :] = 0
    

    Result:

    # print(df)
                   col1        col2
    A 1 2017-04-01  0.0    0.000000
        2017-06-01  4.0   87.347504
      2 2014-08-01  0.0    0.000000
        2015-01-01  4.0  209.612503
    B 3 2014-07-01  0.0    0.000000
        2014-12-01  1.0   64.370003
      4 2015-01-01  0.0    0.000000