I have a dataset that for simplicity I need to group by and aggregate based on one column so that I can remove some rows easily. Once I am done with the calculations, I need to reverse the group by actions so that I can see the dataframe easily in excel. If I do not inverse the action, I would export the whole list to excel which is not easy to analyse. Any help is gretaly appreciated.
Example:
Col1 Col2 Col3
123 11 Yes
123 22 Yes
256 33 Yes
256 33 No
337 00 No
337 44 No
After applying groupby and aggregate:
X=dataset.groupby('Col1').agg(lambda x:set(x)).reset_index()
I get
Col1 Col2 Col3
123 {11,22} {Yes}
256 {33} {Yes, No}
337 {00,44} {No}
I then remove all the columns that contain Yes using drop
X=X.reset_index(drop=True)
what I need to get before exporting to excel is
Col1 Col2 Col3
337 00 No
337 44 No
Hope this is clear enough
Thaks in advance
I don't believe converting to a set is a good idea. Here's an alternative: First sort in descending order by Col3
, then create a mapping of Col2 : Yes/No
and filter based on that.
In [1191]: df = df.sort_values('Col3', ascending=True)
In [1192]: mapping = dict(df[['Col2', 'Col3']].values)
In [1193]: df[df.Col2.replace(mapping) == 'No'] # or df.Col2.map(mapping)
Out[1193]:
Col1 Col2 Col3
4 337 0 No
5 337 44 No