
groupby + apply + lambda fail to call my function


I wrote a function that should normalise the values belonging to the same week, dividing each by the value of the first day and subtracting 1:

def normalize(week):
    norm_week = (week / week[0]) - 1
    return norm_week

I get the data for each week from a groupby call and pass it to the normalize function through the apply method (with a lambda):

dataset['col_1_norm'] = dataset.groupby('week_number')['col_1'].apply(lambda x: normalize(x))

This is the input dataset:

week_number   col_1
week_1        300
week_1        500
.....         ...
week_2        350
.....         ...

I expect the normalised values in the column "col_1_norm", but Python raises several errors (for example: 3361 return self._engine.get_loc(casted_key)).

Where am I going wrong? Could you please help? Thanks, Charlie


Solution

  • I inserted a print statement in your normalize routine to debug:

    def normalize(week):                                                            
        print(week)                                                                 
        norm_week = (week/week[0]) -1                                          
        return norm_week 
    

    Here's what printed out:

    
    0      300                                                                                               
    1      500                                                                                               
    2      900                                                                                               
    3     1300                                                                                               
    4      200                                                                                               
    5      400                                                                                               
    6      100                                                                                               
    7      800                                                                                               
    8      500                                                                                               
    9      600                                                                                               
    10     500                                                                                               
    Name: week_1, dtype: int64                                                                               
    11     800                                                                                               
    12    1700                
    13    1800                                                                                               
    14    2000                                                                                               
    15    1500                                          
    Name: week_2, dtype: int64                                                                               
    Traceback (most recent call last):
    ...
    

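    week[0] looks up the index label 0, not the first position. The week_1 group happens to contain label 0, but the week_2 group's index starts at 11, so there is no label 0 and the lookup raises a KeyError. Perhaps you meant to use iloc, which selects by position?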

    def normalize(week):                                                            
        norm_week = (week/week.iloc[0]) -1                                          
        return norm_week
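
To check the fix end to end, here is a minimal sketch on made-up data (the five rows below are hypothetical, not the asker's real dataset). It first shows that a label lookup of 0 fails on the second group, then assigns the normalised column. It swaps apply for transform, which is a substitution on my part rather than something the original code used: transform returns a result aligned to the original index, whereas, depending on the pandas version, apply may prepend the group keys to the index and make the column assignment fail.

```python
import pandas as pd

# Hypothetical sample data shaped like the question's dataset
dataset = pd.DataFrame({
    "week_number": ["week_1", "week_1", "week_1", "week_2", "week_2"],
    "col_1": [300, 500, 900, 800, 1700],
})

# The week_2 rows keep their original labels (3 and 4), so label 0 is absent
week_2 = dataset.loc[dataset["week_number"] == "week_2", "col_1"]
try:
    week_2[0]  # label-based lookup on an integer index
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

def normalize(week):
    # .iloc[0] picks the first row of the group by position,
    # whatever index labels the group carries
    return week / week.iloc[0] - 1

# transform calls normalize once per group and keeps the original index
dataset["col_1_norm"] = dataset.groupby("week_number")["col_1"].transform(normalize)
print(dataset)
```

Each week's first row comes out as 0.0, and the remaining rows hold the relative change from that first value.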