Here is my dataframe :
df = pd.DataFrame(pd.DataFrame({"C1" : [0.5, 0.9, 0.1, 0.2, 0.3, 0.5, 0.2],
"C2" : [200, 158, 698, 666, 325, 224, 584],
"C3" : [15, 99, 36, 14, 55, 62, 37]},
index = pd.MultiIndex.from_tuples([(0,0), (1,0), (1,1), (2,0), (2,1), (3,0), (4,0)],
names=['L1','L2'])))
df :
C1 C2 C3
L1 L2
0 0 0.5 200 15
1 0 0.9 158 99
1 0.1 698 36
2 0 0.2 666 14
1 0.3 325 55
3 0 0.5 224 62
4 0 0.2 584 37
I would like to keep the rows that only have one value in L1 subindex (0 in that case) in order to get something like that :
C1 C2 C3
L1 L2
0 0 0.5 200 15
3 0 0.5 224 62
4 0 0.2 584 37
Please, could you let me know if you have any clue to solve this problem ?
Sincerely
Use GroupBy.transform
by first level with any column with GroupBy.size
and compare by Series.eq
and filter by boolean indexing
:
df1 = df[df.groupby(level=0)['C1'].transform('size').eq(1)]
Or extract index of first level by Index.get_level_values
and filter with inverted mask by ~
with Index.duplicated
and keep=False
for all dupes:
df1 = df[~df.index.get_level_values(0).duplicated(keep=False)]