
pandas apply TypeError: 'float' object is not subscriptable


I have a dataframe df_tr like this:

      item_id    target   target_sum  target_count
0        0          0           1            50            
1        0          0           1            50              

I'm trying to find the mean of the target but excluding the target value of the current row, and put the mean value in a new column. The result would be:

     item_id    target   target_sum  target_count item_id_mean_target
0        0          0           1            50           0.02041
1        0          0           1            50           0.02041

where I computed item_id_mean_target value from the formula:

(target_sum - target) / (target_count - 1)

...with this code:

df_tr['item_id_mean_target'] = df_tr.target.apply(lambda x: (x['target_sum']-x)/(x['target_count']-1))     

I think my solution is correct but instead I got:

TypeError: 'float' object is not subscriptable                

Solution

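  • No need for apply here: arithmetic on pandas columns is vectorised (it is backed by numpy), so the operation is applied element-wise to whole columns at once. With df standing in for your df_tr: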

    df['item_id_mean_target'] = (df.target_sum - df.target) / (df.target_count - 1)
    

    df
    
       item_id  target  target_sum  target_count  item_id_mean_target
    0        0       0           1            50             0.020408
    1        0       0           1            50             0.020408
    

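    As for why your error occurs: you are calling apply on a pd.Series (df_tr.target), so the lambda receives one scalar value of target at a time, not a row. x['target_sum'] therefore tries to index a float, which raises the TypeError, and there is no way to reference other columns from inside that apply.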

    To fix it, you'd need to call apply on the DataFrame row by row, i.e. df.apply(..., axis=1), but that runs a Python function for every row and is much slower than the vectorised version, so I wouldn't recommend it.
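
For anyone who wants to reproduce this, here is a minimal sketch that puts the three variants side by side. The frame is rebuilt from the two rows shown in the question; storing target as float is an assumption on my part (the table prints 0, but the error message suggests the real column holds floats). The names error_message, vectorised and row_wise are just illustrative.

```python
import pandas as pd

# Sample frame rebuilt from the question's table.
# Assumption: target is stored as float, as the error message suggests.
df_tr = pd.DataFrame({
    "item_id": [0, 0],
    "target": [0.0, 0.0],
    "target_sum": [1, 1],
    "target_count": [50, 50],
})

# 1) Series.apply hands the function one scalar at a time,
#    so trying to index that scalar by column name fails.
error_message = None
try:
    df_tr.target.apply(
        lambda x: (x["target_sum"] - x) / (x["target_count"] - 1)
    )
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# 2) Vectorised column arithmetic: the recommended approach.
vectorised = (df_tr.target_sum - df_tr.target) / (df_tr.target_count - 1)

# 3) Row-wise DataFrame.apply: gives the same numbers, but calls a
#    Python function once per row, so it is slower on large frames.
row_wise = df_tr.apply(
    lambda row: (row["target_sum"] - row["target"]) / (row["target_count"] - 1),
    axis=1,
)

df_tr["item_id_mean_target"] = vectorised
print(df_tr)
```

Variants 2 and 3 should agree on every row; the difference only shows up in run time as the frame grows.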