python dataframe multi-index

Accessing Columns in a Multi Index


I have a data frame as shown below.

[image of the data frame]

It has a MultiIndex of sorts, with the first three columns separate. I want to apply a lambda function to the "Class" column and create a new adjacent column with the output.

I have tried

df['newcol'] = df.apply(lambda x: newfunc(x) for x in df['Class'])

This seems to encounter difficulties, which I think are associated with the multilevel indexing. How would I go about doing this? Furthermore, how would I insert the new column adjacent to the "Class" column?


Solution

  • Class is not a column of the DataFrame; it is a level (2) of the multi-index, so it has to be accessed with something like:

    df.index.get_level_values('Class')
    

    These values can then be used to form a new column, such as df['new'].

    If you want this new column to become an additional level of the multi-index, which may be what you are suggesting, you can use:

    df = df.set_index('new', append=True)
    

    Alternatively, you could revert from a multi-index to ordinary columns using the line below, form the new column(s) as required, and then, if needed, rebuild a multi-index from the relevant columns.

    df = df.reset_index()
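
Putting the pieces above together, here is a minimal sketch of both routes. The original screenshot is not available, so the sample data, the level names "Group" and "Id", the "value" column and the body of newfunc are all assumptions made for illustration; only "Class", df['new'], get_level_values, set_index and reset_index come from the question and answer. DataFrame.insert is one way to place the new column directly after "Class" once the index has been flattened.

```python
import pandas as pd

# Hypothetical data: only the "Class" level name comes from the question;
# "Group", "Id" and "value" are placeholders for the unseen data frame.
index = pd.MultiIndex.from_tuples(
    [("A", 1, "x"), ("A", 2, "y"), ("B", 1, "x")],
    names=["Group", "Id", "Class"],
)
df = pd.DataFrame({"value": [10, 20, 30]}, index=index)


def newfunc(label):
    """Stand-in for the asker's function; here it upper-cases the label."""
    return str(label).upper()


# Route 1: read the level values and assign the result as a regular column.
# Index.map applies the function element-wise.
df["new"] = df.index.get_level_values("Class").map(newfunc)

# Optionally move the new column into the index as an extra (last) level.
df_indexed = df.set_index("new", append=True)

# Route 2: flatten the index to ordinary columns, then insert the new
# column at the position immediately after "Class".
flat = df.drop(columns="new").reset_index()
position = flat.columns.get_loc("Class") + 1
flat.insert(position, "new", flat["Class"].map(newfunc))
```

With route 2, a multi-index can be rebuilt afterwards with set_index on whichever columns are needed, for example flat.set_index(["Group", "Id", "Class", "new"]) under the assumed names.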