I want to slice a data frame using a partially matching index, or list of tuples.
_ix = [('foo','a', 1), ('foo','a', 2), ('foo','b', 1),
('foo','b', 2), ('foo','c', 1), ('foo','c', 2)]
df = pd.DataFrame(np.ones((6, 1)), index=pd.MultiIndex.from_tuples(_ix))
print(df)
0
foo a 1 1.0
2 1.0
b 1 1.0
2 1.0
c 1 1.0
2 1.0
Given a query index like:
q_ix = [('foo', 'a'), ('foo', 'c')]
I want to obtain
0
foo a 1 1.0
2 1.0
c 1 1.0
2 1.0
I can get this by using pd.concat
and a list comprehension...
df_sliced = pd.concat([df.loc[(*x, slice(None)), :] for x in q_ix])
...but this is super clunky when my query index is large. Is there no better way?
Use pd.IndexSlice
from pandas
import pandas as pd
idx = pd.IndexSlice
df.loc[idx[:, ['a', 'c']], :] # Can use 'foo' instead of : on the first lvl
outputs
0
foo a 1 1.0
2 1.0
c 1 1.0
2 1.0
It reads take everything along the first level (:
) then on the second level grab ["a", "c"]
. Which we enclose in idx
to mark that it is a slice. Finally, the last :
tells that we want all the columns.