I wrote this code:
import pandas as pd
import numpy as np
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
df_city = pd.DataFrame({'LS': [65.29, 67.40,66.55,65.98,67.09,67.80], 'Traditionnelle': [50.94, 62.69,56.60,61.60,53.97,53.12], 'Region': ['Est', 'Nord','Ouest','Paris centre','Sud Est','Sud Ouest']})
df_contact=pd.DataFrame({'LS': [62.64, 64.32,64.72,69.39,66.10,64.39], 'Traditionnelle': [57.45, 60.07,57.73,62.54,57.98,56.20], 'Region': ['Est', 'Nord','Ouest','Paris centre','Sud Est','Sud Ouest']})
df_express=pd.DataFrame({'LS': [62.43,65.06,64.36,66.30,64.53,63.87], 'Traditionnelle': [54.84,54.39,52.29,45.62,53.78,51.89], 'Region': ['Est', 'Nord','Ouest','Paris centre','Sud Est','Sud Ouest']})
df_huit=pd.DataFrame({'LS': [52.47,58.98,54.47,62.44,50.33,50.51], 'Traditionnelle': [37.36,43.47,43.94,43.27,46.16,45.62], 'Region': ['Est', 'Nord','Ouest','Paris centre','Sud Est','Sud Ouest']})
df_multi = pd.concat([df_city.set_index('Region'), df_contact.set_index('Region'),df_express.set_index('Region'),df_huit.set_index('Region')], axis=1, keys=['city', 'contact','express','huit'])
df_test = pd.DataFrame({'Region': ["Est","Est","Sud Est","Sud Ouest"], 'Enseigne': ["huit","contact","contact","express"], 'boucherie': ['LS', 'Traditionnelle','LS','LS']})
dfResult=df_multi.loc[df_test["Region"],(df_test['Enseigne'],df_test['boucherie'])]
print(df_test)
df_test["taux"]=np.nan
for i in range(df_test.shape[0]):
    # .loc[i, "taux"] avoids the chained assignment df_test["taux"][i] = ...
    df_test.loc[i, "taux"] = df_multi.loc[df_test["Region"][i], (df_test['Enseigne'][i], df_test['boucherie'][i])]
print(df_test)
Is there another way to use the values of df_test as loc keys into df_multi without a loop?
I tried this, but it does not give the same result because it returns a DataFrame instead of one value per row:
df_test["taux"]=df_multi.loc[df_test["Region"].values,(df_test['Enseigne'].values,df_test['boucherie'].values)]
I also tried:
dfResult=df_multi.loc[zip(df_test["Region"].values,(df_test['Enseigne'].values,df_test['boucherie'].values))]
But that did not work either.
Simply:
df_test["taux"] = df_test.apply(lambda x: df_multi.loc[x["Region"], (x["Enseigne"], x["boucherie"])], axis=1)
which gives
      Region  Enseigne       boucherie   taux
0        Est      huit              LS  52.47
1        Est   contact  Traditionnelle  57.45
2    Sud Est   contact              LS  66.10
3  Sud Ouest   express              LS  63.87
Et voilà!
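Note that apply with axis=1 still calls loc once per row, so it is a loop in disguise. If df_test grows large, a positional lookup may be worth trying. The following is only a sketch under a few assumptions: the index and columns of df_multi are unique, and every (Region, Enseigne, boucherie) combination in df_test exists in df_multi. The helper make_frame is hypothetical and only shortens the setup; the lookup itself uses get_indexer and NumPy integer indexing.

```python
import pandas as pd

regions = ["Est", "Nord", "Ouest", "Paris centre", "Sud Est", "Sud Ouest"]

def make_frame(ls, trad):
    # Hypothetical helper (not in the question): one frame per store brand,
    # already indexed by Region.
    return pd.DataFrame({"LS": ls, "Traditionnelle": trad},
                        index=pd.Index(regions, name="Region"))

df_multi = pd.concat(
    [
        make_frame([65.29, 67.40, 66.55, 65.98, 67.09, 67.80],
                   [50.94, 62.69, 56.60, 61.60, 53.97, 53.12]),
        make_frame([62.64, 64.32, 64.72, 69.39, 66.10, 64.39],
                   [57.45, 60.07, 57.73, 62.54, 57.98, 56.20]),
        make_frame([62.43, 65.06, 64.36, 66.30, 64.53, 63.87],
                   [54.84, 54.39, 52.29, 45.62, 53.78, 51.89]),
        make_frame([52.47, 58.98, 54.47, 62.44, 50.33, 50.51],
                   [37.36, 43.47, 43.94, 43.27, 46.16, 45.62]),
    ],
    axis=1,
    keys=["city", "contact", "express", "huit"],
)

df_test = pd.DataFrame({
    "Region": ["Est", "Est", "Sud Est", "Sud Ouest"],
    "Enseigne": ["huit", "contact", "contact", "express"],
    "boucherie": ["LS", "Traditionnelle", "LS", "LS"],
})

# Translate labels into integer positions, one position per row of df_test.
rows = df_multi.index.get_indexer(df_test["Region"])
col_keys = pd.MultiIndex.from_arrays([df_test["Enseigne"], df_test["boucherie"]])
cols = df_multi.columns.get_indexer(col_keys)

# get_indexer returns -1 for unknown labels; fail loudly instead of silently
# picking the last row or column.
if (rows < 0).any() or (cols < 0).any():
    raise KeyError("some keys in df_test are missing from df_multi")

# Pairwise (not cross-product) lookup through NumPy integer indexing.
df_test["taux"] = df_multi.to_numpy()[rows, cols]
print(df_test)
```

Unlike df_multi.loc[rows, cols], which returns the cross product of the requested rows and columns, indexing the NumPy array with two integer arrays of equal length picks one value per (row, column) pair, which is what the loop computes.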