Say I have a Pandas multi-index data frame with 3 indices:
import pandas as pd
import numpy as np
arrays = [['UK', 'UK', 'US', 'FR'], ['Firm1', 'Firm1', 'Firm2', 'Firm1'], ['Andy', 'Peter', 'Peter', 'Andy']]
idx = pd.MultiIndex.from_arrays(arrays, names = ('Country', 'Firm', 'Responsible'))
df_3idx = pd.DataFrame(np.random.randn(4,3), index = idx)
df_3idx
0 1 2
Country Firm Responsible
UK Firm1 Andy 0.237655 2.049636 0.480805
Peter 1.135344 0.745616 -0.577377
US Firm2 Peter 0.034786 -0.278936 0.877142
FR Firm1 Andy 0.048224 1.763329 -1.597279
I have furthermore another pd.dataframe consisting of unique combinations of multi-index-level 1 and 2 from the above data:
arrays = [['UK', 'US', 'FR'], ['Firm1', 'Firm2', 'Firm1']]
idx = pd.MultiIndex.from_arrays(arrays, names = ('Country', 'Firm'))
df_2idx = pd.DataFrame(np.random.randn(3,1), index = idx)
df_2idx
0
Country Firm
UK Firm1 -0.103828
US Firm2 0.096192
FR Firm1 -0.686631
I want to subtract the values from df_3idx
by the corresponding value in df_2idx
, so, for instance, I want to subtract from every value of the first two rows the value -0.103828, as index 1 and 2 from both dataframes match.
Does anybody know how to do this? I figured I could simply unstack the first dataframe and then subtract, but I am getting an error message.
df_3idx.unstack('Responsible').sub(df_2idx, axis=0)
ValueError: cannot join with no overlapping index names
Unstacking might anyway not be a preferable solution as my data is very big and unstacking might take a lot of time.
I would appreciate any help. Many thanks in advance!
related question but not focused on MultiIndex
However, the answer doesn't really care. The sub
method will align on the matching index levels.
pd.DataFrame.sub
with parameter axis=0
df_3idx.sub(df_2idx[0], axis=0)
0 1 2
Country Firm Responsible
FR Firm1 Andy 0.027800 3.316148 0.804833
UK Firm1 Andy -2.009797 -1.830799 -0.417737
Peter -1.174544 0.644006 -1.150073
US Firm2 Peter -2.211121 -3.825443 -4.391965