I have MultiIndex dataframe (table1) and I want to merge specific columns from another dataframe that is not multiIndex (table 2).
Example of table 1:
>>> name 2020-10-21 2020-10-22 ...
Column 9 10 11 12 9 10 11 12
0 A5 2.1 2.2 2.4 2.8 5.4 3.4 1.1 7.3
1 B9 7.2 1.2 14.5 7.5 3.4 5.2 6.4 8.1
2 C3 1.1 6.5 8.4 9.1 1.1 4.3 6.5 8.7
...
Example of table 2:
>>>name indc control code
0 A5 0.32 yes 1
1 C3 0.11 no 2
2 B18 0.23 yes 2
3 B9 0.45 no 3
I want to merge the column "code" based on key "name" from table 2 (and "index" from table 1) to get the code beside te name:
>>> index 2020-10-21 2020-10-22 ...
Column code 9 10 11 12 9 10 11 12
0 A5 1 2.1 2.2 2.4 2.8 5.4 3.4 1.1 7.3
1 B9 3 7.2 1.2 14.5 7.5 3.4 5.2 6.4 8.1
2 C3 2 1.1 6.5 8.4 9.1 1.1 4.3 6.5 8.7
...
I know how to merge when the index is not multindex level, then I do so something like this:
df = table1.merge(table2[['code','name']], how = 'left',
left_on = 'index', right_on = 'name')
but now I get error:
UserWarning: merging between different levels can give an unintended result (2 levels on the left,1 on the right) warnings.warn(msg, UserWarning)
and then:
ValueError: 'index' is not in list
when I print the columns I can see that thy are like tuples but I don't know why it says the index is not in list as when I print the oclumns of table 1 I get:
Index([ ('index', ''), (2020-10-22, 9)...
so i'm a bit confused.
My end goal: to merge the code column based on the columns "name" and "index"
For correct working need MultiIndex
in both DataFrame
s:
df2 = table2[['code','name']].rename(columns={'name':'index'})
df2.columns = pd.MultiIndex.from_product([df2.columns, ['']])
df = table1.merge(df2, how = 'left', on = [('index', '')])
#if necessary reorder columns names
cols = df.columns[:1].tolist() + df.columns[-1:].tolist() + df.columns[1:-1].tolist()
df = df[cols]
print (df)
index code 2020-10-21 2020-10-22
9 10 11 12 9 10 11 12
0 A5 1 2.1 2.2 2.4 2.8 5.4 3.4 1.1 7.3
1 B9 3 7.2 1.2 14.5 7.5 3.4 5.2 6.4 8.1
2 C3 2 1.1 6.5 8.4 9.1 1.1 4.3 6.5 8.7