Search code examples
pythonpandasdataframemulti-indexcalculated-columns

Create new calculated columns in multi index data frame based on original column names while retaining level-two format


I have a large multi-indexed data frame with many rows and columns. I want to add new columns, with names based on the original level one column names. I want to keep the level two format of the multi-index for each newly created column. The new columns calculate changes and percent changes between columns. Ideally I'd like this to be done automatically so I don't have to create new columns and column names manually.

Original:

import numpy as np
import pandas as pd


data = [[99,3,12,4,63,55]]

cols = pd.MultiIndex.from_product([['1. FY21','2. FY22','3. FY23'],['Values','Sites']])

df = pd.DataFrame(data, columns = cols)

print(df)


enter image description here

Desired Output:

data_new = [[99,3,-36,52,-36,1733,12,4,51,51,425,1275,63,55]]

cols_new = pd.MultiIndex.from_product([['1. FY21','FY23-FY21','FY23-FY21_ % Change','2. FY22','FY23-FY22','FY23-FY22_ % Change','3. FY23'],['Values','Sites']])

df_new = pd.DataFrame(data_new, columns = cols_new)

print(df_new)

enter image description here


Solution

  • Try:

    cols = df.columns.get_level_values(0).unique()
    last = df.xs(cols[-1], level=0, axis=1)
    n = cols[-1].split()[-1]
    
    all_dfs = []
    for c in cols[:-1]:
        o = df.xs(c, level=0, axis=1)
    
        d = last - o
        d.columns = pd.MultiIndex.from_product([[f"{n}-{c.split()[-1]}"], d.columns])
    
        ch = (last / o - 1) * 100
        ch.columns = pd.MultiIndex.from_product(
            [[f"{n}-{c.split()[-1]}_% Change"], ch.columns]
        )
    
        o.columns = pd.MultiIndex.from_product([[c], o.columns])
    
        all_dfs.extend([o, d, ch])
    
    last.columns = pd.MultiIndex.from_product([[cols[-1]], last.columns])
    all_dfs.append(last)
    
    out = pd.concat(all_dfs, axis=1)
    print(out)
    

    Prints:

      1. FY21       FY23-FY21       FY23-FY21_% Change              2. FY22       FY23-FY22       FY23-FY22_% Change         3. FY23      
       Values Sites    Values Sites             Values        Sites  Values Sites    Values Sites             Values   Sites  Values Sites
    0      99     3       -36    52         -36.363636  1733.333333      12     4        51    51              425.0  1275.0      63    55