My data looks like this:
col_1 col_2
1 1
1 1
p 0
1 1
n 2
n 2
p 0
p 0
I want to calculate the values in col_2 from col_1. The logic I want to apply is: when the col_1 value is 'p', replace the value in col_2 with the previous row's col_2 value; for any other value of col_1, col_2 stays unchanged. The expected output is as follows:
col_1 col_2
1 1
1 1
p **1**
1 1
n 2
n 2
p **2**
p **2**
I am calculating this column, along with others based on a date, inside the assign() function. This is the only step I cannot figure out. Since I need the previous row's value, shift() could work, but I only need to look up the previous col_2 value when col_1 == 'p'. For the time being I am using a for loop, which gives me the flexibility to look back one row and check or replace the value, but the loop makes this an inefficient solution.
Do you know how to avoid the for loop and do this in a more pandas-like way?
You can use mask, then ffill:
df['col_2'] = df['col_2'].mask(df['col_1']=='p').ffill()
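For reference, here is a minimal runnable sketch of that one-liner on the sample data from the question. The DataFrame construction, the string dtype of col_1, and the names `filled`, `result`, and `result_lambda` are assumptions made for illustration. Since the question mentions assign(), the same expression is also shown inside it.

```python
import pandas as pd

# Sample data from the question (col_1 assumed to hold strings).
df = pd.DataFrame({
    "col_1": ["1", "1", "p", "1", "n", "n", "p", "p"],
    "col_2": [1, 1, 0, 1, 2, 2, 0, 0],
})

# mask() replaces col_2 with NaN wherever col_1 is 'p';
# ffill() then carries the last non-NaN value forward into those rows.
filled = df["col_2"].mask(df["col_1"] == "p").ffill()

# mask() introduces NaN, so the column becomes float. Casting back to int
# is safe here only because no NaN remains after ffill().
result = df.assign(col_2=filled.astype(int))

# The same expression written as a callable inside assign(), so it can sit
# alongside other computed columns in one chain.
result_lambda = df.assign(
    col_2=lambda d: d["col_2"].mask(d["col_1"] == "p").ffill().astype(int)
)

print(result)
```

Two things to watch for: if the very first row has col_1 == 'p', there is no earlier value to fill from, so that row stays NaN and the astype(int) cast would fail. Consecutive 'p' rows all receive the last non-'p' value, which matches the expected output in the question.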