dataframe filter groupby based on a subset

df_example = pd.DataFrame({'name': ['a', 'a', 'a', 'b', 'b', 'b'],
                          'class': [1, 2, 2, 3, 2, 2],
                          'price': [3, 4, 2, 1, 6, 5]})

I want to filter each name where the price is larger than the smallest price in a subset class==2 within name grounp:

df_example.sort_values(['name', 'price'], inplace=True)
df_tem = df_example[df_example['class'] == 2].groupby('name').first()

Below is the pseudocode:

df_example.groupby('name').apply(lambda key, val: val['price'] > df_tem.loc[key]['price']).reset_index()

Is there any effective way to achieve something like filter dataframe based on a subset within groupby

result:

the smallest price with class=2 for each name group df_tem:

    class   price
name        
a   2   2
b   2   5

Therefore,

group a: price>2; group b: price>5

the output:

pd.DataFrame({'name': ['a', 'a', 'b'],
                          'class': [1, 2, 2],
                          'price': [3, 4, 6]})

Update:

actually i have an idea that create a new column called smallest, then filter

df_example by df_example['price'] > df_example['smallest '].

Do you know how to quickly create such column something like

df_example['smallest '] = df_example[df_example['class'] == 2].groupby('name')['price'].transform('first')

above way still have nan

Solution

Code

Use groupby to aggregate the minimum and use map to map the result to name column for boolean indexing.

m = df_example[df_example['class'] == 2].groupby('name')['price'].min()
out = df_example[df_example['price'] > df_example['name'].map(m)]

out

  name  class  price
0    a      1      3
1    a      2      4
4    b      2      6

update additional question

Additionally, if you want to include name for which class=2 does not exist, use the code below

m = df_example[df_example['class'] == 2].groupby('name')['price'].min()
cond1 = df_example['price'] > df_example['name'].map(m)
cond2 = ~df_example['name'].isin(m.index)
out = df_example[cond1 | cond2]

new example

df_example = pd.DataFrame({'name': ['a', 'a', 'a', 'b', 'b', 'b', 'c'],
                          'class': [1, 2, 2, 3, 2, 2, 3],
                          'price': [3, 4, 2, 1, 6, 5, 5]})

out:

  name  class  price
0    a      1      3
1    a      2      4
4    b      2      6
6    c      3      5