Search code examples
pythonpandasdataframecompare

compare multiple columns of pandas dataframe with one column


I have a dataframe: df-

    A   B   C   D   E
0   V   10  5   18  20
1   W   9   18  11  13
2   X   8   7   12  5
3   Y   7   9   7   8
4   Z   6   5   3   90

I want to add a column 'Result' which should return 1 if the value in column 'E' is greater than the values in B, C & D columns else return 0.

Output should be:

    A   B   C   D   E   Result
0   V   10  5   18  20  1
1   W   9   18  11  13  0
2   X   8   7   12  5   0
3   Y   7   9   7   8   0
4   Z   6   5   3   90  1

For few columns, i would use logic like : if(and(E>B,E>C,E>D),1,0), But I have to compare around 20 columns (from B to U) with column name 'V'. Additionally, the dataframe has around 100 thousand rows.

I am using

df['Result']=np.where((df.ix[:,1:20])<df['V']).all(1),1,0)

And it gives a Memory error.


Solution

  • One possible solution is compare in numpy and last convert boolean mask to ints:

    df['Result'] = (df.iloc[:, 1:4].values < df[['E']].values).all(axis=1).astype(int)
    print (df)
       A   B   C   D   E  Result
    0  V  10   5  18  20       1
    1  W   9  18  11  13       0
    2  X   8   7  12   5       0
    3  Y   7   9   7   8       0
    4  Z   6   5   3  90       1