I would like to have a pandas DataFrame like the following: Required Dataframe
Column A holds the input values; from these I would like to generate a new variable, Column B.
Every time the value in A changes, a new "Step_(new_number)" should start. So the first run of zeros is Step_1, the next run of values is Step_2, the run after that is Step_3, and so on.
This is what I have done so far:
def f(row):
    if row['A'] > 0:
        val = "Step_1"
    else:
        val = "Step_0"
    return val

df['B'] = df.apply(f, axis=1)
But I do not understand how to conditionally increase the step count in the new column B. Note that "Step_1" could be replaced by 1; it does not need to be text, if that makes the solution easier.
One option is to create global variables that track the previous value and the current step, and to read and update them inside the function you pass to pandas.Series.apply, like so:
import pandas as pd

df = pd.DataFrame(dict(
    A=[0, 0, 0, 44.67, 44.67, 0, 0, 35.49, 35.49, 35.49, 0]
))

step = 0      # current step number
value = None  # last value seen in column A

def get_step(x):
    global step
    global value
    # Start a new step whenever the value differs from the previous one
    if x != value:
        value = x
        step += 1
    return f'Step_{step}'

df['B'] = df['A'].apply(get_step)
print(df)
Output:
        A       B
0    0.00  Step_1
1    0.00  Step_1
2    0.00  Step_1
3   44.67  Step_2
4   44.67  Step_2
5    0.00  Step_3
6    0.00  Step_3
7   35.49  Step_4
8   35.49  Step_4
9   35.49  Step_4
10   0.00  Step_5
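One caveat with the global-variable approach: step and value keep their state between calls, so running the apply a second time without resetting them would continue counting from Step_5. As a possible alternative, here is a minimal sketch that avoids the globals by comparing each value with the previous row using shift, then taking a cumulative sum of the change flags. The names changed and step_number are only illustrative, and the sketch assumes column A contains no NaN values (each NaN would compare as different from its neighbour and start its own step).

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 0, 0, 44.67, 44.67, 0, 0, 35.49, 35.49, 35.49, 0]})

# True wherever the value differs from the previous row
# (the first row is compared with NaN, so it is always True)
changed = df["A"].ne(df["A"].shift())

# A running count of the changes gives 1, 1, 1, 2, 2, 3, ...
step_number = changed.cumsum()

# Build the text label; use step_number directly if a plain integer is enough
df["B"] = "Step_" + step_number.astype(str)

print(df)
```

Because this version keeps no state outside the DataFrame, it gives the same labels however many times it is run.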