python pandas dataframe group-by filtering

How to apply multilevel groupby and filter top N records within each subgroup?

Input:
Input

I have an input DataFrame as shown above. The products in each category need to be grouped by the sum of their sales. From this we select the top N products(in this example,N=2) in each category and only those products from the respective categories should be displayed in the final output/dataFrame as shown below

Output:
Output

I could try creating loops and iterating. I was hoping there would be an elegant way to do this.

pd.DataFrame({'Date':['07-08-2023','06-08-2023','08-08-2023','08-08-2023','07-08-2023','06-08-2023','06-08-2023','08-08-2023'],
             'Category':['A','A','A','A','B','B','B','B'],
             'Products':['abc','xyz','abc','def','nop','hjk','qrs','qrs'],
             'Sales':[1,2,4,3,4,5,2,3]})

Solution

Using a groupby to get the top N products, then a merge:

N = 2
cols = ['Category', 'Products']

out = df.merge(df
   .groupby(cols, as_index=False)['Sales'].sum()
   .sort_values(by='Sales', ascending=False)
   .groupby('Category', as_index=False)
   [cols].head(N)
 )

Output:

         Date Category Products  Sales
0  07-08-2023        A      abc      1
1  08-08-2023        A      abc      4
2  08-08-2023        A      def      3
3  06-08-2023        B      hjk      5
4  06-08-2023        B      qrs      2
5  08-08-2023        B      qrs      3