Tags: python-3.x, pandas, conditional-statements, calculated-columns

Generate a new column in pandas - conditionally increase integer count


I would like to have a pandas DataFrame like the following: [image: Required Dataframe]

Column A holds the input values. Based on these values I would like to generate a new variable, Column B.

Every time the value in A changes, a new "Step_(new_number)" label should start. So the first run of zeros is Step_1, the next run of values is Step_2, the run after that is Step_3, and so on.

This is what I have done so far:

def f(row):
    if row['A'] > 0 :
        val = "Step_1"
    else:
        val = "Step_0"
    return val
df['B'] = df.apply(f, axis=1)

But I do not understand how to conditionally increase the count in the new column B. Note that "Step_1" could be replaced by 1; it does not need to be text, if that makes the solution easier.


Solution

  • One option is to keep two global variables, the last value seen and the current step number, and update them inside a function that you pass to pandas.Series.apply, like so:

    import pandas as pd
    
    df = pd.DataFrame(dict(
        A = [0,0,0,44.67,44.67,0,0,35.49,35.49,35.49,0]
    ))
    
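    step = 0      # current step number
    value = None  # last value seen in column A
    
    def get_step(x):
        # Start a new step whenever the value differs from the previous row.
        global step, value
        if x != value:
            value = x
            step += 1
        return f'Step_{step}'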
    
    df['B'] = df['A'].apply(get_step)
    print(df)
    
    

    Output:

            A       B
    0    0.00  Step_1
    1    0.00  Step_1
    2    0.00  Step_1
    3   44.67  Step_2
    4   44.67  Step_2
    5    0.00  Step_3
    6    0.00  Step_3
    7   35.49  Step_4
    8   35.49  Step_4
    9   35.49  Step_4
    10   0.00  Step_5
    
    

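  • Another option, offered here as a sketch rather than as part of the original answer, is to avoid global state: compare each row of A with the previous row using Series.shift, then take the running total of the "changed" flags with cumsum. The names changed and step are only illustrative, and the sample data is the same as above.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 0, 0, 44.67, 44.67, 0, 0, 35.49, 35.49, 35.49, 0]})

# True wherever a row differs from the row before it. The first row is
# compared against NaN (there is no previous row), so it is also True.
changed = df["A"].ne(df["A"].shift())

# The running total of the flags gives every run of equal values its own number.
step = changed.cumsum()

# Build the text label; use df["B"] = step instead to keep plain integers.
df["B"] = "Step_" + step.astype(str)
print(df)
```

    For this sample data the result should match the output shown above. One caveat: NaN never compares equal to NaN, so if A contained consecutive missing values, each of them would start a new step in this sketch.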