Tags: pandas, loops, lambda, iteration, apply

Iteration over rows and apply function based on condition


I am learning about iteration and the apply function in pandas. In the example below, I want to apply a rule to every block of 4 rows (1st row + 0, 2nd row + 0, 3rd row = 0, 4th row = 0), and so on for each following block. Are there any links or resources that show how to do this? Many thanks.

import pandas as pd

data = {'DATE': ['2023-12-29', '2023-12-29', '2023-12-29', '2023-12-29',
                 '2024-01-31', '2024-01-31', '2024-01-31', '2024-01-31',
                 '2024-02-27', '2024-02-27', '2024-02-27', '2024-02-27'],
        'score': [10, 5, 30, 41, 12, 7, 32, 43, 14, 9, 34, 45]}

df = pd.DataFrame(data=data)


Expected output:

        DATE      score result
    0   2023-12-29  10  10
    1   2023-12-29  5   5
    2   2023-12-29  30  0   
    3   2023-12-29  41  0   
    4   2024-01-31  12  12
    5   2024-01-31  7   7
    6   2024-01-31  32  0   
    7   2024-01-31  43  0   
    8   2024-02-27  14  14
    9   2024-02-27  9   9   
    10  2024-02-27  34  0    
    11  2024-02-27  45  0

Solution

  • Code

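    # True for the 3rd and 4th row of every block of four (index % 4 is 2 or 3)
    cond = df.index.to_series().mod(4).isin([2, 3])
    # Replace the score with 0 where cond is True, keep it otherwise
    df['result'] = df['score'].mask(cond, 0)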
    

    df

              DATE  score  result
    0   2023-12-29     10      10
    1   2023-12-29      5       5
    2   2023-12-29     30       0
    3   2023-12-29     41       0
    4   2024-01-31     12      12
    5   2024-01-31      7       7
    6   2024-01-31     32       0
    7   2024-01-31     43       0
    8   2024-02-27     14      14
    9   2024-02-27      9       9
    10  2024-02-27     34       0
    11  2024-02-27     45       0
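
The mask above relies on the DataFrame having the default 0, 1, 2, ... index. As a minimal sketch, assuming the rows are in the intended order but the index may be something else (for example after filtering or setting DATE as the index), the position within each block of four can be computed from the row count instead. The name `pos` is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'DATE': ['2023-12-29'] * 4 + ['2024-01-31'] * 4 + ['2024-02-27'] * 4,
    'score': [10, 5, 30, 41, 12, 7, 32, 43, 14, 9, 34, 45],
})

# Position of each row inside its block of four: 0, 1, 2, 3, 0, 1, ...
# This uses the row count, not the index labels.
pos = np.arange(len(df)) % 4

# Keep the score for the 1st and 2nd row of each block, use 0 for the rest.
df['result'] = df['score'].where(pos < 2, 0)
```

`where` keeps values where the condition is True, so it is the mirror image of the `mask` call used in the answer.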
    

    Updated answer for the additional question

    (1st row +5, 2nd row +8, 3rd row = 0, 4th row -10)

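    # Repeat the four offsets down the frame by looking them up with index % 4
    s = pd.Series([5, 8, 0, -10])[df.index % 4].reset_index(drop=True)
    # Add the matching offset to each score (the 3rd row gets +0)
    df['result'] = df['score'] + s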
    

    df

              DATE  score  result
    0   2023-12-29     10      15
    1   2023-12-29      5      13
    2   2023-12-29     30      30
    3   2023-12-29     41      31
    4   2024-01-31     12      17
    5   2024-01-31      7      15
    6   2024-01-31     32      32
    7   2024-01-31     43      33
    8   2024-02-27     14      19
    9   2024-02-27      9      17
    10  2024-02-27     34      34
    11  2024-02-27     45      35
    

    Indexing a NumPy array would be simpler, but since you might not know NumPy, I indexed a pandas Series instead.
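
For comparison, here is a rough sketch of what the NumPy version mentioned above might look like. It is not part of the original answer; the names `offsets` and `pos` are illustrative, and the data is the `score` column from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'score': [10, 5, 30, 41, 12, 7, 32, 43, 14, 9, 34, 45]})

# One offset per position in a block of four rows.
offsets = np.array([5, 8, 0, -10])

# Position of each row inside its block: 0, 1, 2, 3, 0, 1, ...
pos = np.arange(len(df)) % 4

# Indexing the offsets array with pos repeats the pattern down the frame;
# adding a plain array needs no index alignment or reset_index.
df['result'] = df['score'] + offsets[pos]
```

Like the pandas version, this treats the third offset as +0, so the score in that row is left unchanged. If the third row of each block should instead be set to zero, something like `df.loc[pos == 2, 'result'] = 0` afterwards would do that.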