Search code examples
python-3.xpandascounter

Pandas counter that counts by skipping a row and reset on different values


Hi I am trying to create a counter that counts my trend column by skipping a row and reset itself if the string values are different. For example on row 9 it will count 2 since the previous skipped row it was counted with a 1. But it resets back to one since the value at row 11 is different from row 9. Is there anyway I could do this?

DateTimeStarted         50% Quantile    50Q shift 2H    Trend    Count
0   2020-12-18 15:00:00 554.0           NaN             Flat     1
1   2020-12-18 16:00:00 593.0           NaN             Flat     1
2   2020-12-18 17:00:00 534.0           554.0           Down     1
3   2020-12-18 18:00:00 562.0           593.0           Down     1
4   2020-12-18 19:00:00 552.0           534.0           Up       1
5   2020-12-18 20:00:00 592.0           562.0           Up       1
6   2020-12-19 08:00:00 511.0           552.0           Down     1
7   2020-12-19 09:00:00 584.0           592.0           Down     1
8   2020-12-19 10:00:00 576.0           511.0           Up       1
9   2020-12-19 11:00:00 545.5           584.0           Down     2
10  2020-12-19 12:00:00 609.5           576.0           Up       2
11  2020-12-19 13:00:00 548.0           545.5           Up       1
12  2020-12-19 14:00:00 565.0           609.5           Down     1
13  2020-12-19 15:00:00 575.0           548.0           Up       2
14  2020-12-19 16:00:00 570.0           565.0           Up       1
15  2020-12-19 17:00:00 557.0           575.0           Down     1 
16  2020-12-19 18:00:00 578.0           570.0           Up       2
17  2020-12-19 19:00:00 578.5           557.0           Up       1
18  2020-12-21 08:00:00 543.0           578.0           Down     1
19  2020-12-21 09:00:00 558.0           578.5           Down     1
20  2020-12-21 10:00:00 570.0           543.0           Up       1

Solution

  • You can shift() the Trend column by 2 and check if it equals Trend:

    df['Counter'] = df.Trend.shift(2).eq(df.Trend).astype(int).add(1)
    

    I named it Counter here for comparison:

            DateTimeStarted  50%Quantile  50Qshift2H Trend  Count  Counter
    0   2020-12-18 15:00:00        554.0         NaN  Flat      1        1
    1   2020-12-18 16:00:00        593.0         NaN  Flat      1        1
    2   2020-12-18 17:00:00        534.0       554.0  Down      1        1
    3   2020-12-18 18:00:00        562.0       593.0  Down      1        1
    4   2020-12-18 19:00:00        552.0       534.0    Up      1        1
    5   2020-12-18 20:00:00        592.0       562.0    Up      1        1
    6   2020-12-19 08:00:00        511.0       552.0  Down      1        1
    7   2020-12-19 09:00:00        584.0       592.0  Down      1        1
    8   2020-12-19 10:00:00        576.0       511.0    Up      1        1
    9   2020-12-19 11:00:00        545.5       584.0  Down      2        2
    10  2020-12-19 12:00:00        609.5       576.0    Up      2        2
    11  2020-12-19 13:00:00        548.0       545.5    Up      1        1
    12  2020-12-19 14:00:00        565.0       609.5  Down      1        1
    13  2020-12-19 15:00:00        575.0       548.0    Up      2        2
    14  2020-12-19 16:00:00        570.0       565.0    Up      1        1
    15  2020-12-19 17:00:00        557.0       575.0  Down      1        1
    16  2020-12-19 18:00:00        578.0       570.0    Up      2        2
    17  2020-12-19 19:00:00        578.5       557.0    Up      1        1
    18  2020-12-21 08:00:00        543.0       578.0  Down      1        1
    19  2020-12-21 09:00:00        558.0       578.5  Down      1        1
    20  2020-12-21 10:00:00        570.0       543.0    Up      1        1