I have a dataframe with this index:
index = pd.MultiIndex.from_product([['stock1','stock2'...],['price','volume'...]])
It's a useful structure for being able to do df['stock1']
, but how do I select all the price data? I can't make any sense of the documentation.
I've tried the following with no luck: df[:,'price']
df[:]['price']
df.loc(axis=1)[:,'close']
df['price]
If this index style is generally agreed to be a bad idea for whatever reason, then what would be a better choice? Should I go for a multi-indexed index for the stocks as labels on the time series instead of at the column level?
EDIT - I am using the multiindex for the columns, not the index (the wording got the better of me). The examples in the documentation focus on multi-level indexes rather than column structures.
Also using John's data sample:
Using xs()
is another way to slice a MultiIndex
:
df
0
stock1 price 1
volume 2
stock2 price 3
volume 4
stock3 price 5
volume 6
df.xs('price', level=1, drop_level=False)
0
stock1 price 1
stock2 price 3
stock3 price 5
Alternatively if you have a MultiIndex
in place of columns:
df
stock1 stock2 stock3
price volume price volume price volume
0 1 2 3 4 5 6
df.xs('price', axis=1, level=1, drop_level=False)
stock1 stock2 stock3
price price price
0 1 3 5