Search code examples
pythonpandasdataframemulti-index

Query Multiindex in Pandas Dataframe - Special indexing


I have a specific query for a multi-indexed dataframe which I cannot get my head around how I can achieve it. The various explanations on different websites do unfortunately not help with this. Working example:

col1 = ['30/01/2021','30/01/2021','31/01/2021','31/01/2021','01/02/2021','01/02/2021']
col2 = ['USD','EUR','USD','EUR','USD','EUR']
col3 = ['3M','1M','1M','3M','3M','6M']
col4 = [0.9,0.95,0.89,0.91,0.94,0.92]

df = pd.DataFrame(index = [col1, col2, col3], data = col4, columns = ['Value'])

I can now apply "simple" indexing like:

df.loc[:,['USD','EUR'],:]

which gets me all entries for the currencies USD and EUR:

                   Value
30/01/2021 USD 3M   0.90
31/01/2021 USD 1M   0.89
01/02/2021 USD 3M   0.94
30/01/2021 EUR 1M   0.95
31/01/2021 EUR 3M   0.91
01/02/2021 EUR 6M   0.92

or

df.loc[:,['USD','EUR'],['1M','3M']]

which gets me all entries for the currencies USD and EUR that are 1M or 3M:

                   Value
31/01/2021 USD 1M   0.89
30/01/2021 USD 3M   0.90
01/02/2021 USD 3M   0.94
30/01/2021 EUR 1M   0.95
31/01/2021 EUR 3M   0.91

However, what I actually want to have is all entries with (currency USD and time 1M) AND all entries with (currency EUR and time 3M).

That is, I desire to have as result (that the date is the same is a coincidence of this example. So I cannot simply look for the date)

                   Value
31/01/2021 USD 1M   0.89                   
31/01/2021 EUR 3M   0.91

How can I achieve that? I tried various versions of tuples and lists combinations, but as said, couldn't figure it out.

Thank you for any guidance.


Solution

  • You can create MultiIndex by both lists by MultiIndex.from_arrays and then match to second and third level by removing first level of original MultiIndex by MultiIndex.droplevel, test by Index.isin and filter in boolean indexing:

    a = ['USD','EUR']
    b = ['1M','3M']
    
    mux = pd.MultiIndex.from_arrays([a, b])
    
    df = df[df.index.droplevel(0).isin(mux)]
    print (df)
                           Value
    31/01/2021 USD 1M   0.89
               EUR 3M   0.91