Search code examples
pythonpandasmax

Create new column which is max of other columns with conditions


I would like to make a new column which is the max of two things;

The first is the average of two columns, the second is one of those two columns.

So I essentially want forecast to be the max of the following:


df3['forecast'] = df3[['A', 'B']].mean(axis=1)

df3['forecast'] = df3[['A']]

Any help greatly appreciated! Cheers


Solution

  • Use numpy.maximum:

    np.random.seed(123)
    
    df3 = pd.DataFrame(np.random.randint(5, size=(10,3)), columns=list('ABC'))
    
    df3['forecast'] = np.maximum(df3[['A', 'B']].mean(axis=1), df3['A'])
    print (df3)
       A  B  C  forecast
    0  2  4  2       3.0
    1  1  3  2       2.0
    2  3  1  1       3.0
    3  0  1  1       0.5
    4  0  0  1       0.0
    5  3  4  0       3.5
    6  0  4  1       2.0
    7  3  2  4       3.0
    8  2  4  0       3.0
    9  0  1  3       0.5
    

    Or create helper column C, assign mean and last get maximum:

    df3['forecast'] = df3.assign(C=df3[['A', 'B']].mean(axis=1))[['A','C']].max(axis=1)
    print (df3)
       A  B  C  forecast
    0  2  4  2       3.0
    1  1  3  2       2.0
    2  3  1  1       3.0
    3  0  1  1       0.5
    4  0  0  1       0.0
    5  3  4  0       3.5
    6  0  4  1       2.0
    7  3  2  4       3.0
    8  2  4  0       3.0
    9  0  1  3       0.5