I need the cells excluded from summation when the top level of the multiindex (country1 and Country2) are the same in the row and columns
import pandas as pd
label_columns1 = ['Country1','Country2']
label_columns2 = ['agri1', 'agri2']
label_rows1 = ['Country1','Country2']
label_rows2 = ['agri1', 'agri2']
df = pd.DataFrame([[1,2,3,4]], index=pd.MultiIndex.from_product([label_rows1,label_rows2]),
columns=pd.MultiIndex.from_product([label_columns1, label_columns2 ]))
df
Country1 Country2
agri1 agri2 agri1 agri2
Country1 agri1 1 2 3 4
agri2 1 2 3 4
Country2 agri1 1 2 3 4
agri2 1 2 3 4
I know this will give me the row-wise sum and a new column
df[('Total',"Vlaue")] = df.sum(axis=1)
df
Country1 Country2 Total
agri1 agri2 agri1 agri2 Value
Country1 agri1 1 2 3 4 10
agri2 1 2 3 4 10
Country2 agri1 1 2 3 4 10
agri2 1 2 3 4 10
What I really want and have been trying hard was only to add the cells if the column and row index is not the same. Which in this case for example the first row will be 7. my data frame is big and too much rows and columns. What i have tried so far
for col in df.columns:
for row in df.index:
if col != row:
df[('Total',"Vlaue")] = df.sum(axis=1)
which gives me unintended results but I guess i am close My expected result should look like
Country1 Country2 Total
agri1 agri2 agri1 agri2 Value
Country1 agri1 1 2 3 4 (3+4) = 7
agri2 1 2 3 4 (3+4) = 7
Country2 agri1 1 2 3 4 (1+2) = 3
agri2 1 2 3 4 (1+2) = 3
Mask the values where index is same as the column label, then calculate sum along axis=1
ix = df.index.get_level_values(0)
cx = df.columns.get_level_values(0)
m = ix.values[:, None] == cx.values
df[('Total', 'Value')] = df.mask(m).sum(axis=1)
Country1 Country2 Total
agri1 agri2 agri1 agri2 Value
Country1 agri1 1 2 3 4 7.0
agri2 1 2 3 4 7.0
Country2 agri1 1 2 3 4 3.0
agri2 1 2 3 4 3.0