Search code examples
pythonpandasmulti-index

How to select second level in multiindex when using columns?


I have a dataframe with this index:

index = pd.MultiIndex.from_product([['stock1','stock2'...],['price','volume'...]])

It's a useful structure for being able to do df['stock1'], but how do I select all the price data? I can't make any sense of the documentation.

I've tried the following with no luck: df[:,'price'] df[:]['price'] df.loc(axis=1)[:,'close'] df['price]

If this index style is generally agreed to be a bad idea for whatever reason, then what would be a better choice? Should I go for a multi-indexed index for the stocks as labels on the time series instead of at the column level?

EDIT - I am using the multiindex for the columns, not the index (the wording got the better of me). The examples in the documentation focus on multi-level indexes rather than column structures.


Solution

  • Also using John's data sample:

    Using xs() is another way to slice a MultiIndex:

    df
                   0
    stock1 price   1
           volume  2
    stock2 price   3
           volume  4
    stock3 price   5
           volume  6
    
    df.xs('price', level=1, drop_level=False)
                  0
    stock1 price  1
    stock2 price  3
    stock3 price  5
    

    Alternatively if you have a MultiIndex in place of columns:

    df
      stock1        stock2        stock3       
       price volume  price volume  price volume
    0      1      2      3      4      5      6
    
    df.xs('price', axis=1, level=1, drop_level=False)
      stock1 stock2 stock3
       price  price  price
    0      1      3      5