I'm trying to predict class probabilities for X_test and I get an array with 2 values per row. I need to compare those 2 values and reduce each row to a single value (0 or 1).
When I run this code:
y_pred = classifier.predict_proba(X_test)
y_pred
It gives output like
array([[0.5, 0.5],
       [0.6, 0.4],
       [0.7, 0.3],
       ...,
       [0.5, 0.5],
       [0.4, 0.6],
       [0.3, 0.7]])
We know that if a value is >= 0.5 then it's 1, and if it's less than 0.5 it's 0.
I converted the above array into a pandas DataFrame using the code below:
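import pandas as pd
proba = pd.DataFrame(y_pred)  # y_pred is the predict_proba output from above
proba.columns = ['pred_0', 'pred_1']  # flat list; a nested list would create a MultiIndex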
proba.head()
And the output is:
   pred_0  pred_1
0     0.5     0.5
1     0.6     0.4
2     0.7     0.3
3     0.4     0.6
4     0.3     0.7
How can I iterate over the rows above and compare the two columns, so that the result is 1 when the value in the first column is greater than or equal to 0.5 and 0 when it is less than 0.5?
For example, for the data frame above the output must be:
output
0 0
1 1
2 1
3 1
4 1
You could just map over your initial array, without converting it to a pandas DataFrame. The comparison returns True when the first value of each subarray is >= 0.5 and False otherwise, and int() then converts that to 1 or 0:
>>> import numpy as np
>>> a = np.array([[0.5, 0.5], [0.6, 0.4], [0.3, 0.7]])
>>> a
array([[0.5, 0.5],
       [0.6, 0.4],
       [0.3, 0.7]])
>>> result = map(lambda x: int(x[0] >= 0.5), a)
>>> print(list(result))
[1, 1, 0]
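If the array is large, you may prefer to skip the Python-level loop that map performs. Here is a minimal sketch of the same threshold done in one vectorized step. It assumes, as in the map example, that you want to threshold the first column at 0.5. The names labels and the output column are just illustrative, and the DataFrame part is only there in case you want to keep the pandas version from the question.

```python
import numpy as np
import pandas as pd

# Same sample probabilities as in the example above.
a = np.array([[0.5, 0.5], [0.6, 0.4], [0.3, 0.7]])

# Compare the whole first column at once (boolean array), then cast to int.
labels = (a[:, 0] >= 0.5).astype(int)
print(labels.tolist())

# The same idea on a DataFrame: no row iteration is needed.
proba = pd.DataFrame(a, columns=['pred_0', 'pred_1'])
proba['output'] = (proba['pred_0'] >= 0.5).astype(int)
print(proba['output'].tolist())
```

Both prints should give the same list as the map version. To threshold the second column instead, use a[:, 1] or proba['pred_1'].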