I have N dataframes, in this case lets use 2 dfs as an example:
df1 = pd.DataFrame([['a', 2], ['b', 4]], columns=['foo', 'bar'])
df2 = pd.DataFrame([['a', 3], ['b', 5]], columns=['foo', 'bar'])
Which produce:
foo bar
0 a 2
1 b 4
foo bar
0 a 3
1 b 5
How can I concat or merge them into a multi-index, where the new column level's name is based on some external variable attached to the dfs, Eg I will use the df name as an example here:
df1 df2
foo bar bar
0 a 2 3
1 b 4 5
The dataframes are guaranteed to have the same foo
values in the same order.
You can use a custom concat
:
dfs = [df1, df2]
out = (pd.concat({f'df{i}': d.set_index('foo')
for i, d in enumerate(dfs, start=1)}, axis=1)
.reset_index()
)
Output:
foo df1 df2
bar bar
0 a 2 3
1 b 4 5
If really you need the "foo" at the bottom:
out = (pd.concat({f'df{i}': d.set_index('foo')
for i, d in enumerate(dfs, start=1)}, axis=1)
.swaplevel(axis=1)
.reset_index().swaplevel(axis=1)
)
Output:
df1 df2
foo bar bar
0 a 2 3
1 b 4 5