Search code examples
pythonpandasmulti-index

Operate on columns with the same name in the second level in a dataframe


I have a dataframe with a multi-index on the columns:

df = pd.DataFrame({('a', 'status'): [0.1, 0.2, 0.3],
                   ('a', 'value'): [1.1, 1.2, 1.3],
                   ('b', 'status'): [0.1, 0.2, 0.3],
                   ('b', 'value'): [2.1, 2.2, 2.3],
                   ('c', 'status'): [0.1, 0.2, 0.3]})

My goal is to multiply all the value columns by a scalar, or add a scalar to them. I have been struggling to find the appropriate expression to use with direct indexing or iloc, but can't seem to find the right one. Here are some failed attempts:

>>> df[(None, 'value')] += 2
...
KeyError: 2

>>> df.iloc[:, (None, 'value')] += 2
...
IndexingError: Too many indexers

I imagine it's possible, though not very elegant to make a mask or index of the columns, so I tried:

>>> df.columns.levels[1] == 'value'
array([False,  True])

This does not help with the five actual columns that I have.


Solution

  • You need slicers :

    df.loc[:, (slice(None), "value")] +=2
    
    #another variant
    # df.loc[:, pd.IndexSlice[:, "value"]] += 2
    

    Output :

    print(df)
    
           a            b            c
      status value status value status
    0    0.1   3.1    0.1   4.1    0.1
    1    0.2   3.2    0.2   4.2    0.2
    2    0.3   3.3    0.3   4.3    0.3