I have a dataframe with columns and rows that have names as below index:
idx = ['CAN_agr', 'CAN_ser', 'USA_agr', 'USA_ser', 'MEX_agr', 'MEX_ser']
sample = pd.DataFrame(
[[1, 2, 3, 4, 5 ,6],
[7, 8, 9, 10, 11, 12],
[13, 14, 15, 16, 17, 18],
[19, 20, 21, 22, 23, 24],
[25, 26, 27, 28, 29, 30],
[31, 32, 33, 34, 35, 36]],
index = idx,
columns=idx
)
I wanted to use this names to create multi-index where level=0 will be name of countries (i.e. CAN, USA, MEX) and level=1 will have the industry type(i.e. agr, ser) for each country. I thought of using str.extract, but I was not sure how I could separate the labels according to countries and industry types and assign them as the multi-index columns and rows. Could anyone help me how I should approach this issue? Thank you very much!
split
index and columns on _
and then assign them back as index and columns should do:
sample.index = sample.index.str.split('_', expand=True)
sample.columns = sample.columns.str.split('_', expand=True)
sample
CAN USA MEX
agr ser agr ser agr ser
CAN agr 1 2 3 4 5 6
ser 7 8 9 10 11 12
USA agr 13 14 15 16 17 18
ser 19 20 21 22 23 24
MEX agr 25 26 27 28 29 30
ser 31 32 33 34 35 36