Filter grouped pandas dataframe, keep all rows with minimum value in column


import pandas as pd

df = pd.DataFrame([['SAM', 23, 1],
                   ['SAM', 23, 2],
                   ['SAM', 23, 1],
                   ['SAM', 23, 3],
                   ['BILL', 36, 1],
                   ['BILL', 36, 2],
                   ['BILL', 36, 3],
                   ['BILL', 36, 1],
                   ['JIMMY', 33, 4],
                   ['JIMMY', 33, 2],
                   ['JIMMY', 33, 2],
                   ['JIMMY', 33, 3],
                   ['CARTER', 25, 3],
                   ['CARTER', 25, 4],
                   ['CARTER', 25, 5],
                   ['CARTER', 25, 4],
                   ['GRACE', 27, 4],
                   ['GRACE', 27, 5],
                   ['GRACE', 27, 6],
                   ['TOMMY', 32, 7]])
df.columns = ['A', 'B', 'C']

I need to keep every row that holds the minimum value of column 'C' within its group, where the groups are defined by column 'A', and leave column 'B' unchanged. There is a nearly identical question here, but if I use

df.loc[df.groupby('A').C.idxmin()]

Only one row per group remains, even when several rows tie for the minimum, and I need all of them. Expected result:

[Image: expected result]
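
The behaviour described in the question can be reproduced with a small sketch. The frame `small` below is a hypothetical reduced version of the data, chosen so that one group has a tie for the minimum. It suggests why the `idxmin` approach keeps a single row: `idxmin` yields one index label per group, the first occurrence of the minimum.

```python
import pandas as pd

# Hypothetical reduced frame: SAM has two rows tied for the minimum of C.
small = pd.DataFrame({'A': ['SAM', 'SAM', 'SAM', 'BILL', 'BILL'],
                      'B': [23, 23, 23, 36, 36],
                      'C': [1, 2, 1, 1, 2]})

# idxmin returns a single index label per group:
# the first row at which the group's minimum occurs.
first_min_labels = small.groupby('A')['C'].idxmin()

# Selecting by those labels therefore gives exactly one row per group,
# so the second SAM row with C == 1 is lost.
one_per_group = small.loc[first_min_labels]
print(one_per_group)
```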


Solution

  • Use groupby.transform to compute the minimum of C per group, broadcast back to every row, then compare it with df['C'] and keep the rows where the two are equal:

    df.loc[df.groupby('A')['C'].transform('min').eq(df['C'])].reset_index(drop=True)
    
            A   B  C
    0     SAM  23  1
    1     SAM  23  1
    2    BILL  36  1
    3    BILL  36  1
    4   JIMMY  33  2
    5   JIMMY  33  2
    6  CARTER  25  3
    7   GRACE  27  4
    8   TOMMY  32  7
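
The one-liner above can be unpacked into its intermediate steps. This is a minimal sketch on a hypothetical reduced frame (`small`, `group_min`, and `is_min` are illustrative names, not part of the original answer), intended only to show what each piece produces.

```python
import pandas as pd

# Hypothetical reduced frame: SAM has a tie at its minimum, BILL does not.
small = pd.DataFrame({'A': ['SAM', 'SAM', 'SAM', 'BILL', 'BILL'],
                      'B': [23, 23, 23, 36, 36],
                      'C': [1, 2, 1, 2, 3]})

# transform('min') returns a Series aligned with the original rows;
# each row holds the minimum of C for its own group.
group_min = small.groupby('A')['C'].transform('min')

# Boolean mask: True where a row's C equals its group's minimum.
is_min = small['C'].eq(group_min)

# Keep every row at the group minimum, ties included.
# Row order follows the original frame; reset_index renumbers from 0.
result = small.loc[is_min].reset_index(drop=True)
print(result)
```

Because the mask is aligned with the original index, the filter keeps rows in their original order and leaves the other columns, such as B, untouched.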