
Generating new column containing modal value


I have the following data frame:

import pandas as pd

skus = [1, 1, 1, 2, 2]
prices = [10.99, 10.99, 11.50, 9.99, 9.99]
data = dict(skus=skus, prices=prices)
df = pd.DataFrame(data, columns=data.keys())

How do I create another column, prices_modal, so that the result looks like this:

   skus  prices  prices_modal
0     1   10.99         10.99
1     1   10.99         10.99
2     1   11.50         10.99
3     2    9.99          9.99
4     2    9.99          9.99

That is, each row should hold the mode (most frequent value) of the prices for its sku.


Solution

  • Method 1

    Using groupby, transform and pd.Series.mode:

    df['prices_modal'] = df.groupby('skus')['prices'].transform(lambda x: pd.Series.mode(x)[0])
    
       skus  prices  prices_modal
    0     1   10.99         10.99
    1     1   10.99         10.99
    2     1   11.50         10.99
    3     2    9.99          9.99
    4     2    9.99          9.99
    

  • Method 2

    Using statistics.mode from the standard library:

    from statistics import mode
    
    df['prices_modal'] = df.groupby('skus')['prices'].transform(mode)
    
       skus  prices  prices_modal
    0     1   10.99         10.99
    1     1   10.99         10.99
    2     1   11.50         10.99
    3     2    9.99          9.99
    4     2    9.99          9.99
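
The two methods agree on the data above, but they can differ when a group has tied modes or no usable values. Series.mode returns every mode sorted in ascending order, so taking the first element picks the smallest one, and it returns an empty Series when a group contains only missing values, in which case indexing the first element fails. statistics.mode, on Python 3.8 and later, returns the first mode encountered in the data. The sketch below illustrates this under those assumptions; the extra skus 3 and 4, the helper name smallest_mode, and the column names mode_pandas and mode_stats are hypothetical and not part of the original question.

```python
import statistics

import numpy as np
import pandas as pd

# Hypothetical data: sku 3 has two tied modes, sku 4 has only a missing price.
df = pd.DataFrame({
    "skus":   [1, 1, 1, 3, 3, 4],
    "prices": [10.99, 10.99, 11.50, 8.50, 7.25, np.nan],
})


def smallest_mode(group):
    """Return the smallest mode of a group, or NaN if it has no non-missing values."""
    modes = group.mode()  # all modes, sorted ascending; NaN is dropped by default
    return modes.iloc[0] if not modes.empty else np.nan


grouped = df.groupby("skus")["prices"]

# Method 1 variant: guarded against groups whose mode is empty.
df["mode_pandas"] = grouped.transform(smallest_mode)

# Method 2: on Python 3.8+ a tie resolves to the first value seen in the group.
df["mode_stats"] = grouped.transform(statistics.mode)

print(df)
```

For sku 3 the two columns disagree: mode_pandas holds 7.25 (the smallest tied value) while mode_stats holds 8.50 (the first one seen). Which behaviour is preferable depends on how ties should be treated in your data.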