Search code examples
pythonpandasgroup-by

Pandas group by with conditions


I have a df with some retailer specific prices , i need to group and calculate avg prices on some conditions and add column on same df

example df

source  | price     | stock
abc     | 10        | in stock
def     | 12        | in stock
def     | 12        | in stock
abc     | 10        | out of stock
qwe     | 15        | in stock
qwe     | 15        | out of stock
qwe     | 13        | in stock
qwe     | 13        | in stock

condition

  1. Considering abc as my base source i need to compare average price of abc with remaning onces
  2. Should not include Out of stock product for any calculations

required output

source  | abc price    | comp price | price index
def     | 10           | 12         | 0.83
qwe     | 10           | 13.66      | 0.73

Here comp price is average of all rows for source def and qwe in qwe we need to exclude out of stock and take avg of all in stock and populate in comp price

I have tried

.groupby(group_combination)['price'].transform('avg')

It didn't work

please help to get optimal solution


Solution

  • If I understood correctly, you can use:

    ref = 'abc'
    
    out = (df.query('stock != "out of stock"')
             .groupby('source', as_index=False)['price'].mean()
             .assign(**{f'{ref} price': lambda d: d.set_index('source').loc[ref, 'price']})
             .rename(columns={'price': 'comp price'})
             .assign(**{'price index': lambda d: d['comp price'].rdiv(d[f'{ref} price'])})
             .loc[lambda d: d['source'].ne(ref)]
          )
    
    print(out)
    

    Output:

       source  comp price  abc price  price index
    1     def   12.000000       10.0     0.833333
    2     qwe   13.666667       10.0     0.731707