I am looking to manipulate a large set of data based on a couple of conditionals. One is based on the same row whereas the other is based on a cell from a different row.
For example i have a df like this,where I have used
df['true'] = df.loc[:,['max','value']].min(axis=1)
to add the 'true' column
max value true
0,00 3,00 0,00
0,00 4,00 0,00
0,00 4,00 0,00
0,00 2,00 0,00
3,00 3,00 3,00
3,00 1,00 1,00
3,00 4,00 3,00
3,00 1,00 1,00
3,00 4,00 3,00
3,00 0,00 0,00
But I would also like to add a condition to the column 'true' based on the value of a cell in 'max' two rows up. Like .loc I would like to check the entire dataframe for this condition with something like
df.loc[df['max'] - 2 = 0,'true'] = 0
and since pandas is very good at querying i don't want to write an if statement to iterate over the entire dataframe.
The output in this case would be:
max value true
0,00 3,00 0,00
0,00 4,00 0,00
0,00 4,00 0,00
0,00 2,00 0,00
3,00 3,00 **0,00**
3,00 1,00 **0,00**
3,00 4,00 3,00
3,00 1,00 1,00
3,00 4,00 3,00
3,00 0,00 0,00
Any suggestions?
Thank you
Solution thx @EdChum =
df.loc[df['max'].shift(+ 2) == 0,'true'] = 0
I think you need shift
with parameter periods = 2
:
df.loc[df['max'].shift(periods = 2) == 0,'true'] = 0
print df
max value true
0 0 3 0
1 0 4 0
2 0 4 0
3 0 2 0
4 3 3 0
5 3 1 0
6 3 4 3
7 3 1 1
8 3 4 3
9 3 0 0