Tags: python, pandas, apply, pandasql

Pandas speedup apply on max()


I'd like to know how I may speed up the following function, e.g. with Cython?

def groupby_maxtarget(df, group, target):
    # For each group, keep the rows where `target` equals that group's max
    df_grouped = df.groupby([group]).apply(lambda g: g[g[target] == g[target].max()])
    return df_grouped

This function groups by a single column and returns, for each group, all rows where the target column reaches that group's maximum value.

If df has, say, 500K rows, the above function takes about 5 minutes to complete on my computer. This performance is OK, but I have data with over 10 million rows... Of course, one solution would be to execute the above as a query within a SQL server and have Python retrieve the result, but I was hoping for an SQL-free, Pythonic solution.
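For reference, here is a minimal, self-contained run of the function above on a toy frame (column names and values chosen arbitrarily for illustration):

```python
import pandas as pd

def groupby_maxtarget(df, group, target):
    # For each group, keep the rows where `target` equals that group's max
    return df.groupby([group]).apply(lambda g: g[g[target] == g[target].max()])

df = pd.DataFrame({'A': ['x', 'x', 'y', 'y', 'y'],
                   'B': [1, 3, 2, 5, 5]})

# Group 'x' keeps its single max row (B=3); group 'y' keeps both
# rows that tie for the max (B=5).
print(groupby_maxtarget(df, 'A', 'B'))
```

Note that ties within a group are all kept, and `apply` adds the group key as an outer index level.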


Solution

  • In [22]: pd.set_option('max_rows',20)
    
    In [33]: N = 10000000
    
    In [34]: df = DataFrame({'A' : np.random.randint(0,100,size=N), 'B' : np.random.randint(0,100,size=N)})
    
    In [35]: df[df.groupby('A')['B'].transform('max') == df['B']]
    Out[35]: 
              A   B
    161      30  99
    178      53  99
    264      58  99
    337      96  99
    411      44  99
    428      85  99
    500      84  99
    598      98  99
    602      24  99
    684      31  99
    ...      ..  ..
    9999412  25  99
    9999482  35  99
    9999502   6  99
    9999537  24  99
    9999579  65  99
    9999680  32  99
    9999713  74  99
    9999886  90  99
    9999887  57  99
    9999991  45  99
    
    [100039 rows x 2 columns]
    
    In [36]: %timeit df[df.groupby('A')['B'].transform('max') == df['B']]
    1 loops, best of 3: 1.85 s per loop
    

    Note that the runtime is proportional to the number of groups, but the coefficient is pretty small: e.g., with 100x more groups it is only about 2x slower. transform is quite efficient because it broadcasts each group's max back to every row.

    In [8]: G = 100
    
    In [9]: df = DataFrame({'A' : np.random.randint(0,G,size=N), 'B' : np.random.randint(0,G,size=N)})
    
    In [10]: %timeit df[df.groupby('A')['B'].transform('max') == df['B']]
    1 loops, best of 3: 1.86 s per loop
    
    In [11]: G = 10000
    
    In [12]: df = DataFrame({'A' : np.random.randint(0,G,size=N), 'B' : np.random.randint(0,G,size=N)})
    
    In [13]: %timeit df[df.groupby('A')['B'].transform('max') == df['B']]
    1 loops, best of 3: 3.95 s per loop
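Outside an IPython session, the same idea as a self-contained sketch (function name is mine), including a check that it returns the same rows as the original groupby/apply version:

```python
import numpy as np
import pandas as pd

def groupby_maxtarget_fast(df, group, target):
    # transform('max') broadcasts each group's max back onto the original
    # index, so a single vectorized comparison filters the frame -- no
    # per-group Python lambda is ever called.
    return df[df.groupby(group)[target].transform('max') == df[target]]

rng = np.random.default_rng(0)
df = pd.DataFrame({'A': rng.integers(0, 100, size=100_000),
                   'B': rng.integers(0, 100, size=100_000)})

fast = groupby_maxtarget_fast(df, 'A', 'B')
slow = df.groupby(['A']).apply(lambda g: g[g['B'] == g['B'].max()])

# Same rows either way; apply merely adds an outer group-key index level.
assert len(fast) == len(slow)
```

The `transform`-based version also preserves the original flat index and row order, which is often what you want anyway.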