import pandas as pd
data = {'start_value': [10, 20, 30, 40, 50, 60, 70],
        'identifier': ['+', '+', '-', '-', '+', '-', '-']}
df = pd.DataFrame(data)
start_value identifier
0 10 +
1 20 +
2 30 -
3 40 -
4 50 +
5 60 -
6 70 -
I am attempting to create a new column "end_value" that adds 5 to or subtracts 5 from the "start_value" column, based on the "+" or "-" value in the "identifier" column, resulting in the df below.
start_value identifier end_value
0 10 + 15.0
1 20 + 25.0
2 30 - 25.0
3 40 - 35.0
4 50 + 55.0
5 60 - 55.0
6 70 - 65.0
Running the code below, I realize the second assignment replaces the values in the "end_value" column, resulting in the df that follows:
df['end_value'] = 5 + df.loc[df['identifier']=="+"]['start_value']
df['end_value'] = -5 + df.loc[df['identifier']=="-"]['start_value']
start_value identifier end_value
0 10 + NaN
1 20 + NaN
2 30 - 25.0
3 40 - 35.0
4 50 + NaN
5 60 - 55.0
6 70 - 65.0
How would I apply an if statement to combine the results, so that 5 is added if the identifier col == "+" and 5 is subtracted if the identifier col == "-"?
I've done something similar with strings using this post below, but I am unsure how to successfully apply this for a mathematical operation resulting in 'end_value' dtype as float.
Pandas: if row in column A contains "x", write "y" to row in column B
You could use .apply() with a lambda expression.
import pandas as pd
data = {'start_value': [10, 20, 30, 40, 50, 60, 70],
        'identifier': ['+', '+', '-', '-', '+', '-', '-']}
df = pd.DataFrame(data)
# Row-wise: add 5 when identifier is "+", otherwise subtract 5
df["end_value"] = df.apply(lambda row: row.start_value + 5 if row.identifier == "+" else row.start_value - 5, axis=1)
This assumes that the values of the identifier column are either + or -; any other value would fall into the else branch and have 5 subtracted.
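For context on the NaN values in the question: each assignment to df['end_value'] replaces the whole column, and rows missing from the filtered Series are filled with NaN through index alignment, so only the second assignment survives. As an alternative to the row-wise .apply(), here is a minimal vectorized sketch that maps each identifier to an offset with Series.map. The name offset is only illustrative, and the explicit cast to float is an assumption based on the float dtype asked for in the question.

```python
import pandas as pd

data = {'start_value': [10, 20, 30, 40, 50, 60, 70],
        'identifier': ['+', '+', '-', '-', '+', '-', '-']}
df = pd.DataFrame(data)

# Map each identifier to its offset; identifiers not in the dict become NaN
offset = df['identifier'].map({'+': 5, '-': -5})

# Add the offsets column-wise and cast to float, as requested in the question
df['end_value'] = (df['start_value'] + offset).astype(float)

print(df)
```

Unlike the else branch in the lambda, an unexpected identifier here would yield NaN rather than silently subtracting 5, which may make bad data easier to spot.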