Search code examples
python-3.xpandascounter

How to create new column that counts and reset based on a string value in another column


I have a dataframe that goes like this

    A               B       C       B_shifted   C_shifted   Trend
0   553.666667      533.50  574.00  NaN         NaN         Flat
1   590.818182      575.50  595.50  533.50      574.00      Up
2   531.333333      527.50  536.50  575.50      595.50      Down
3   562.000000      562.00  562.00  527.50      536.50      Up
4   551.857143      538.50  557.50  562.00      562.00      Down
5   592.000000      585.50  598.50  538.50      557.50      Up
6   511.000000      511.00  511.00  585.50      598.50      Down
7   564.333333      548.00  590.50  511.00      511.00      Up
8   574.333333      552.00  580.00  548.00      590.50      Flat
9   537.500000      513.25  574.50  552.00      580.00      Down
10  609.500000      582.25  636.75  513.25      574.50      Up
11  535.000000      531.00  565.00  582.25      636.75      Down
12  567.142857      539.50  588.50  531.00      565.00      Up
13  566.625000      546.25  594.25  539.50      588.50      Up
14  576.631579      556.00  598.00  546.25      594.25      Up
15  558.333333      538.00  584.00  556.00      598.00      Down

I am trying to create a column to count the number of times the trend happens. It will increment if the same trend persist (e.g. from Up to Up, Count becomes 2) and reset it back to 1 if the trend changes (e.g. from Up to down, Count becomes 1 again)

This is what I am trying to achieve

    A               B       C       B_shifted   C_shifted   Trend   Counter
0   553.666667      533.50  574.00  NaN         NaN         Flat    1
1   590.818182      575.50  595.50  533.50      574.00      Up      1
2   531.333333      527.50  536.50  575.50      595.50      Down    1
3   562.000000      562.00  562.00  527.50      536.50      Up      1
4   551.857143      538.50  557.50  562.00      562.00      Down    1
5   592.000000      585.50  598.50  538.50      557.50      Up      1
6   511.000000      511.00  511.00  585.50      598.50      Down    1
7   564.333333      548.00  590.50  511.00      511.00      Up      1
8   574.333333      552.00  580.00  548.00      590.50      Flat    1
9   537.500000      513.25  574.50  552.00      580.00      Down    1
10  609.500000      582.25  636.75  513.25      574.50      Up      1
11  535.000000      531.00  565.00  582.25      636.75      Down    1
12  567.142857      539.50  588.50  531.00      565.00      Up      1
13  566.625000      546.25  594.25  539.50      588.50      Up      2
14  576.631579      556.00  598.00  546.25      594.25      Up      3
15  558.333333      538.00  584.00  556.00      598.00      Down    1

Is there anyway to do that?


Solution

  • You can shift() the Trend column to get trending indexes and then cumsum() within the trending groups:

    trending = df.Trend.eq(df.Trend.shift())
    df['Counter'] = trending.groupby(trending).cumsum().add(1)
    

    Output:

                 A       B       C  B_shifted  C_shifted Trend  Counter
    0   553.666667  533.50  574.00        NaN        NaN  Flat        1
    1   590.818182  575.50  595.50     533.50     574.00    Up        1
    2   531.333333  527.50  536.50     575.50     595.50  Down        1
    3   562.000000  562.00  562.00     527.50     536.50    Up        1
    4   551.857143  538.50  557.50     562.00     562.00  Down        1
    5   592.000000  585.50  598.50     538.50     557.50    Up        1
    6   511.000000  511.00  511.00     585.50     598.50  Down        1
    7   564.333333  548.00  590.50     511.00     511.00    Up        1
    8   574.333333  552.00  580.00     548.00     590.50  Flat        1
    9   537.500000  513.25  574.50     552.00     580.00  Down        1
    10  609.500000  582.25  636.75     513.25     574.50    Up        1
    11  535.000000  531.00  565.00     582.25     636.75  Down        1
    12  567.142857  539.50  588.50     531.00     565.00    Up        1
    13  566.625000  546.25  594.25     539.50     588.50    Up        2
    14  576.631579  556.00  598.00     546.25     594.25    Up        3
    15  558.333333  538.00  584.00     556.00     598.00  Down        1