I have a set of data with multiple experiences. Two for egg
, two for spam
:
>>> df = pd.read_csv(StringIO("""
name,egg1,egg2,spam1,spam2
foo,4,8,15,16
bar,23,42,66,83
"""))
>>> df.set_index('name')
egg1 egg2 spam1 spam2
name
foo 4 8 15 16
bar 23 42 66 83
What I would like to get is something like this :
egg spam
1 2 1 2
name
foo 4 8 15 16
bar 23 42 66 83
I read about Multiindex and indexslice, but I haven't found how to use them properly.
Any advice?
You can split the columns, which convert to a MultiIndex, and drop the empty level:
(df.set_index('name')
.pipe(lambda df: df.set_axis(df.columns
.str
.split(r"(\d)", expand = True)
.droplevel(-1),
axis = 1,
)
)
)
egg spam
1 2 1 2
name
foo 4 8 15 16
bar 23 42 66 83
Of course, a cleaner way would be to do it in steps:
Set name
as index:
df = df.set_index('name')
Create MultiIndex via str.split
, setting expand = True
:
df.columns = df.columns.str.split(r"(\d)", expand = True).droplevel(-1)
df
egg spam
1 2 1 2
name
foo 4 8 15 16
bar 23 42 66 83