python pandas function dataframe multi-index

Multiindex "get_level_values"-function for arbitrarily many levels


Is there a way to construct a function that applies "get_level_values" an arbitrary number of times and returns the sliced dataframe? An example will explain what I need.

Multiindex:

import numpy as np
import pandas as pd

arrays = [['bar', 'bar', 'bar', 'baz', 'baz', 'foo', 'foo','foo','qux', 'qux'],
          ['one', 'two', 'three', 'one', 'four', 'one', 'two', 'eight','one', 'two'],
          ['green', 'green', 'blue', 'blue', 'black', 'black', 'orange', 'green','blue', 'black']  ]
s = pd.DataFrame(np.random.randn(10), index=arrays)
s.index.names = ['p1','p2','p3']

s
                         0
p1  p2    p3              
bar one   green  -0.676472
    two   green  -0.030377
    three blue   -0.957517
baz one   blue    0.710764
    four  black   0.404377
foo one   black  -0.286358
    two   orange -1.620832
    eight green   0.316170
qux one   blue   -0.433310
    two   black   1.127754

Now, this is the function I want to create:

def my_func(df,levels, values):
    # Code using get_level_values
    return ret

# Example use
my_func(s, ['p1'],['bar'])

p1  p2    p3              
bar one   green  -0.676472
    two   green  -0.030377
    three blue   -0.957517

my_func(s, ['p1','p2'],['bar','one'])

p1  p2    p3              
bar one   green  -0.676472

Above, my_func(s, ['p1'], ['bar']) should return s.loc[s.index.get_level_values('p1')=='bar'], and my_func(s, ['p1','p2'], ['bar','one']) should return s.loc[(s.index.get_level_values('p1')=='bar') & (s.index.get_level_values('p2')=='one')]

So, I want to pass in a list of arbitrarily many levels and a list of the same number of values, and get back the sliced dataframe.

Any help is much appreciated!


Solution

  • Try this and see if it works for you. Since your MultiIndex levels have names, it is easier to build the function around query, which can refer to index levels by name:

    def my_func(df,levels, values):
        # Code using query
        m = dict(zip(levels,values))
        #create expression to use in the query method
        expr = " and ".join(f"{k}=={v!r}" for k,v in m.items())
        ret = df.query(expr)
        return ret
    
    
    #function application
    my_func(s, ['p1'],['bar'])
    
                        0
    p1  p2  p3  
    bar one green   -0.087366
        two green   1.126620
      three blue    0.868515
    
    
    my_func(s, ['p1','p2'],['bar','one'])
    
                        0
    p1  p2  p3  
    bar one green   -0.087366
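
  • If you would rather stay with get_level_values, as in the question, one possible sketch is to build a boolean mask with one comparison per level and AND them together. The name filter_by_levels is made up for this illustration, and the length check is an added assumption about how mismatched inputs should be handled. This version also does not depend on the level names being valid identifiers inside a query string.

```python
import numpy as np
import pandas as pd


def filter_by_levels(df, levels, values):
    """Keep the rows whose index matches every (level, value) pair."""
    # Assumption: mismatched list lengths are treated as a caller error.
    if len(levels) != len(values):
        raise ValueError("levels and values must have the same length")
    # Start from an all-True mask and AND in one comparison per level.
    mask = np.ones(len(df), dtype=bool)
    for level, value in zip(levels, values):
        mask &= df.index.get_level_values(level) == value
    return df.loc[mask]


# Same sample data as in the question (the values are random).
arrays = [['bar', 'bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'three', 'one', 'four', 'one', 'two', 'eight', 'one', 'two'],
          ['green', 'green', 'blue', 'blue', 'black', 'black', 'orange', 'green', 'blue', 'black']]
s = pd.DataFrame(np.random.randn(10), index=arrays)
s.index.names = ['p1', 'p2', 'p3']

print(filter_by_levels(s, ['p1'], ['bar']))
print(filter_by_levels(s, ['p1', 'p2'], ['bar', 'one']))
```

    With empty lists the mask stays all True, so the whole dataframe comes back unchanged, which seems like a reasonable default for "no filter".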