Given a multilevel row as shown in current presentation, I would like to rename the first level index based on information stored in a look up dictionary.
Presently, I have the idea of transpose the df
and looping to the now column
name. Thereafter, the suitable new columns name will be checked and rename based on the information stored in the dictionary (i.e.,str_dic).
However, I wonder whether there is more straight forward of doing this?
The drafted code is as below
import pandas as pd
def create_df (idx):
df = pd.DataFrame ( {'A': [11, 21, 31],
'B': [12, 22, 32],
'C': [13, 23, 33]},
index=['ONE', 'TWO', 'THREE'] )
df.columns = pd.MultiIndex.from_product ( [['level1'], ['level2'], df.columns] )
df = df.set_index ( [[f'idx_{idx}'] * len ( df )], append=True ).swaplevel ( 0 )
# df = df.set_index ( [['temp_general'] * len ( df )], append=True ).swaplevel ( 0 )
return df
#look up dict
str_g = ['idx_0', 'idx_1', 'idx_2', 'idx_3','idx_4']
str_h = ['E', 'b', 'c', 'd','e']
str_dic = {str_g [i]: str_h [i] for i in range ( len ( str_g ) )}
# create the df
all_df = [create_df ( idx ) for idx in range ( 0, len(str_h)-1)] # Delibarately minus one for worse case scenario
df = pd.concat ( all_df, axis=0 )
df=df.T
all_ls=list(df.columns.values.tolist())
for xss in all_ls:
df=df.rename ( columns={xss: str_dic[xss[0]]}, level=1) #WIP
Current presentation
level1
level2
A B C
idx_0 ONE 11 12 13
TWO 21 22 23
THREE 31 32 33
idx_1 ONE 11 12 13
TWO 21 22 23
THREE 31 32 33
idx_2 ONE 11 12 13
TWO 21 22 23
THREE 31 32 33
idx_3 ONE 11 12 13
TWO 21 22 23
THREE 31 32 33
Expected Output
level1
level2
A B C
a ONE 11 12 13
TWO 21 22 23
THREE 31 32 33
b ONE 11 12 13
TWO 21 22 23
THREE 31 32 33
c ONE 11 12 13
TWO 21 22 23
THREE 31 32 33
d ONE 11 12 13
TWO 21 22 23
THREE 31 32 33
#######################
Bonus: Based on suggestion by Anurag Dabas, how can we append another level on the highest level based on another look up dict.
Naively, I would proposed something like
df.index = pd.MultiIndex.from_product ( [s1, df.index.get_level_values ( 0 ).unique (),
df.index.get_level_values ( 1 ).unique ()] )
But it return an error
ValueError: Length mismatch: Expected axis has 12 elements, new values have 48 elements
The failed proposed amended based on Anurag:
str_global=['typ1','typ1','typ2','typ2','typ3']
global_dic = {str_h [i]: str_global [i] for i in range ( len ( str_h ) )}
s1 = [global_dic.get ( x ) for x in df.index.get_level_values ( 0 ).unique ()]
df.index = pd.MultiIndex.from_product ( [s1, df.index.get_level_values ( 0 ).unique (),
df.index.get_level_values ( 1 ).unique ()] )
Expected output
A B C
typ1 E ONE 11 12 13
TWO 21 22 23
THREE 31 32 33
typ1 b ONE 11 12 13
TWO 21 22 23
THREE 31 32 33
typ2 c ONE 11 12 13
TWO 21 22 23
THREE 31 32 33
typ2 d ONE 11 12 13
TWO 21 22 23
THREE 31 32 33
let's try replacing keys of str_dic in 'level 0' index with their values by using list comprehension then generate a MultiIndex by using pd.MultiIndex.from_product()
and set that equal to the index of your dataframe by using index
attribute::
s=[str_dic.get(x) for x in df.index.get_level_values(0).unique()]
df.index=pd.MultiIndex.from_product([s,df.index.get_level_values(1).unique()])
Update:
Since Now you have 4 unique level0 value and 3unique level0 index and 4 value in s1 so pd.MultiIndex.from_product()
create 48 pairs of MultiIndex so it is not useful in that case so use:
s1=[global_dic.get(x) for x in df.index.get_level_values(0)]
df=df.set_index(pd.Series(s1).values,append=True)
df.index=df.index.reorder_levels([2,0,1])
output of df
:
level1
level2
A B C
E ONE 11 12 13
TWO 21 22 23
THREE 31 32 33
b ONE 11 12 13
TWO 21 22 23
THREE 31 32 33
c ONE 11 12 13
TWO 21 22 23
THREE 31 32 33
d ONE 11 12 13
TWO 21 22 23
THREE 31 32 33