I have the following data frame,
f1 f2 f3 f4 f5 f6 f7
0 -0.004446 59.763107 x0 0 60.029999 59.160000 -0.014493
1 -0.003414 113.212220 x1 0 113.599998 113.650002 0.000440
2 -0.013123 36.435513 x2 1 36.919998 36.889999 -0.000813
3 0.003558 68.854090 x3 0 68.420158 68.410179 -0.000146
4 -0.006840 23.021446 x4 0 23.180000 23.100000 -0.003451
... ... ... ... ... ... ... ...
145 0.000724 253.113110 x5 1 252.929993 247.210007 -0.022615
146 0.006567 128.236680 x6 0 127.400002 127.059998 -0.002669
147 -0.009016 610.079200 x7 1 615.630005 605.369995 -0.016666
148 -0.011290 165.173920 x8 0 167.059998 158.300003 -0.052436
149 0.021474 358.496370 x9 0 350.959991 343.329987 -0.021740
Basically, for column f4
, treat 0 as a negative or False and 1 as a positive or True.
If values from columns f1
, f4
and f7
are all negative or all positive, the test column should return true for that row, else it should return false.
I want to create new column called 'test' that will say True
or False
based on my conditions. If in any cases, one of them don't match (either not all true or not all false), it will show False
.
I can make the following code work with 2 columns,
df.loc[:,'test'] = df['f1'].ge(0).eq(df['f4'])
and it works fine.
However, if I try to chain it to add the f7
column like this,
df.loc[:,'test'] = df['f1'].ge(0).eq(df['f4']).eq(df['f7'].ge(0))
the results are wrong.
I want the test column to look like this,
f1 f2 f3 f4 f5 f6 f7 test
0 -0.004446 59.763107 x0 0 60.029999 59.160000 -0.014493 True
1 -0.003414 113.212220 x1 0 113.599998 113.650002 0.000440 False
2 -0.013123 36.435513 x2 1 36.919998 36.889999 -0.000813 False
3 0.003558 68.854090 x3 0 68.420158 68.410179 -0.000146 False
4 -0.006840 23.021446 x4 0 23.180000 23.100000 -0.003451 True
... ... ... ... ... ... ... ...
145 0.000724 253.113110 x5 1 252.929993 247.210007 -0.022615 False
146 0.006567 128.236680 x6 0 127.400002 127.059998 -0.002669 False
147 -0.009016 610.079200 x7 1 615.630005 605.369995 -0.016666 False
148 -0.011290 165.173920 x8 0 167.059998 158.300003 -0.052436 True
149 0.021474 358.496370 x9 0 350.959991 343.329987 -0.021740 False
How do I get the code to work the way I want it to?
maybe because if f1 and f4 are negative, the first two comparisons will return positive, therefore chaining third will return false.
I think what you are trying to do is negative of XOR operation.
Code below will give you the idea.
table = np.array(
[[-5.5,0,-6.6],
[-5.5,0, 6.6],
[-5.5,1,-6.6],
[-5.5,1, 6.6],
[ 5.5,0,-6.6],
[ 5.5,0, 6.6],
[ 5.5,1,-6.6],
[ 5.5,1, 6.6]]
)
pd_table = pd.DataFrame(table)
df = pd_table
and_all = df[0].ge(0) & df[1] & df[2].ge(0)
or_all = df[0].ge(0) | df[1] | df[2].ge(0)
print(and_all.eq(or_all))
The output will be like this:
0 True
1 False
2 False
3 False
4 False
5 False
6 False
7 True
dtype: bool