I have a dataframe with a multi-index an I want to select only those rows whose indexes are 1:1
arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux']),
np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one' ])]
df = pd.DataFrame(np.random.randn(7, 4), index=arrays)
I expect to select only
qux one -0.521636 -2.253847 -0.420486 1.354772
I believe you need remove duplicated rows by first level of MultiIndex
with Index.get_level_values
, Index.duplicated
and boolean indexing
:
df = df[~df.index.get_level_values(0).duplicated(keep=False)]
print (df)
0 1 2 3
qux one -1.031044 -0.838885 0.316934 0.353254
Or use GroupBy.transform
with any column - e.g. first (or you can specify it) for counts per first level:
df = df[df.groupby(level=0)[df.columns[0]].transform('size').eq(1)]