Search code examples
pythonpandaslambdacut

Can I specify the bins for pandas cut as columns from my dataframe?


Hopefully a really easy one, but I'm a bit stumped as to the correct pandas method to use for my problem.

I'm trying to evaluate a Band for the numbers in the 'Value' column in my dataframe, based on whether they are below, between, or above the values in two other columns (Limit1 and Limit2). For example :

Value     Limit1     Limit2    Band
3         2          5         
5         6          7
5         4          8
9         6          7
2         4          5

pd.cut works if I specify bins as single numbers, but I would like to specify the bins as columns within my dataframe, so that each row has it's own specific bins, as per below

df['Band'] = df.apply(lambda x: pd.cut(x.value, bins=[0, x.Limit1, x.Limit2, np.inf], labels=['Band1','Band2','Band3']))

This fails because I provide a series where the cut function is expecting a single number. Can anyone advise how I might be able to do this with pd.cut, or should I be using a different pandas function altogether?

I would rather avoid np.where, as I may have to expand the bins to five or six, and I don't want to have the nested code.

Many thanks in advance!


Solution

  • Let's try with np.select:

    m1 = df['Value'].lt(df['Limit1'])
    m2 = df['Value'].gt(df['Limit2'])
    
    df['Band'] = np.select([m1, m2], ['band1', 'band3'], 'band2')
    

       Value  Limit1  Limit2   Band
    0      3       2       5  band2
    1      5       6       7  band1
    2      5       4       8  band2
    3      9       6       7  band3
    4      2       4       5  band1