For a DataFrame with two MultiIndex levels age
and yearref
, the goal is to add a new MultiIndex level yearconstr
calculated as yearconstr = yearref - age
.
import pandas as pd
df = pd.DataFrame({"value": [1, 2, 3]},
index=pd.MultiIndex.from_tuples([(10, 2015), (3, 2015), (2, 2016)],
names=["age", "yearref"]))
print(df)
# input df:
value
age yearref
10 2015 1
3 2015 2
2 2016 3
We could reset the index, calculate a new column and then put the original index back in place plus the newly defined column, but surely there must be a better way.
df = (df.reset_index()
.assign(yearconstr=lambda df: df.yearref - df.age)
.set_index(list(df.index.names) + ["yearconstr"]))
print(df)
# expected result:
value
age yearref yearconstr
10 2015 2005 1
3 2015 2012 2
2 2016 2014 3
For a concise and straight-forward approach, we can make use of
eval
to generate a new Series calculated from the existing MultiIndex. This is easy since it treats index levels just like columns: df.eval("yearref - age")
rename
the new Seriesset_index
to append the Series to df
using the append=True
argument.Putting everything together:
df.set_index(df.eval("yearref - age").rename("yearconstr"), append=True)
# result:
value
age yearref yearconstr
10 2015 2005 1
3 2015 2012 2
2 2016 2014 3