Tags: python, pandas, lambda, pandas-groupby

How to calculate VWAP (volume-weighted average price) using groupby and apply?


I have read multiple posts similar to my question, but I still can't figure it out. I have a pandas DataFrame that looks like the following (spanning multiple days):

Out[1]: 
                     price  quantity
time                                
2016-06-08 09:00:22  32.30    1960.0
2016-06-08 09:00:22  32.30     142.0
2016-06-08 09:00:22  32.30    3857.0
2016-06-08 09:00:22  32.30    1000.0
2016-06-08 09:00:22  32.35     991.0
2016-06-08 09:00:22  32.30     447.0
...

To calculate the VWAP over the whole frame I could do:

df['vwap'] = (df.quantity * df.price).cumsum() / df.quantity.cumsum()

However, I would like the calculation to restart every day (groupby), but I can't figure out how to make it work with a (lambda?) function.

df['vwap_day'] = df.groupby(df.index.date)['vwap'].apply(lambda ...

Speed is of the essence. I would appreciate any help :)


Solution

  • Option 0
    Plain vanilla approach: apply a function to each day's rows

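    def vwap(df):
        # Use the raw NumPy arrays to avoid pandas overhead inside each group.
        q = df.quantity.values
        p = df.price.values
        # Cumulative traded value divided by cumulative volume.
        return df.assign(vwap=(p * q).cumsum() / q.cumsum())
    
    # group_keys=False keeps the original time index rather than adding the date as a level.
    df = df.groupby(df.index.date, group_keys=False).apply(vwap)
    df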
    
                         price  quantity       vwap
    time                                           
    2016-06-08 09:00:22  32.30    1960.0  32.300000
    2016-06-08 09:00:22  32.30     142.0  32.300000
    2016-06-08 09:00:22  32.30    3857.0  32.300000
    2016-06-08 09:00:22  32.30    1000.0  32.300000
    2016-06-08 09:00:22  32.35     991.0  32.306233
    2016-06-08 09:00:22  32.30     447.0  32.305901
    

  • Option 1
    Throwing in a little eval

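    # Add the traded value (wgtd), take per-day cumulative sums of all columns,
    # then divide cumulative wgtd by cumulative quantity.
    df = df.assign(
        vwap=df.eval(
            'wgtd = price * quantity', inplace=False
        ).groupby(df.index.date).cumsum().eval('wgtd / quantity')
    )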
    df
    
                         price  quantity       vwap
    time                                           
    2016-06-08 09:00:22  32.30    1960.0  32.300000
    2016-06-08 09:00:22  32.30     142.0  32.300000
    2016-06-08 09:00:22  32.30    3857.0  32.300000
    2016-06-08 09:00:22  32.30    1000.0  32.300000
    2016-06-08 09:00:22  32.35     991.0  32.306233
    2016-06-08 09:00:22  32.30     447.0  32.305901
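
Since speed matters here, one more variant that may be worth timing is to skip both apply and eval and rely only on the built-in grouped cumsum. The following is a minimal sketch, not a benchmarked result: the two-day sample frame is made up for illustration (its first rows borrow prices and quantities from the question), and the names cum_value and cum_quantity are hypothetical helpers.

```python
import pandas as pd

# Hypothetical two-day sample; the second day is invented to show the daily reset.
idx = pd.to_datetime(
    ["2016-06-08 09:00:22"] * 3 + ["2016-06-09 09:00:05"] * 2
)
df = pd.DataFrame(
    {
        "price": [32.30, 32.30, 32.35, 33.00, 33.10],
        "quantity": [1960.0, 142.0, 991.0, 500.0, 1500.0],
    },
    index=pd.Index(idx, name="time"),
)

day = df.index.date  # same grouping key as in the options above

# Per-day running totals of traded value and of volume.
cum_value = (df["price"] * df["quantity"]).groupby(day).cumsum()
cum_quantity = df["quantity"].groupby(day).cumsum()

# Divide as plain arrays so the duplicated timestamps never need index alignment.
df["vwap"] = cum_value.to_numpy() / cum_quantity.to_numpy()
print(df)
```

The first row of each day should equal that row's price, which is a quick way to confirm that the running totals restart at the day boundary. Whether this is actually faster than Option 0 or Option 1 on your data is something to measure, for example with %timeit.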