How to calculate difference of rows based on conditions in R


I have the following dataset:

value  group1  group2   x
1        0       0      NA
2        0       0      NA
7        0       1      2.5
5        1       0      NA
8        1       0      NA
4        1       0      NA
6        0       1      1.5
3        1       0      NA
2        1       0      NA
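
In reproducible form, the table above could be built roughly like this. The name `df` is an assumption, and the columns are assumed to be numeric:

```r
# Rebuild the example table; `df` is an assumed name, columns assumed numeric.
df <- data.frame(
  value  = c(1, 2, 7, 5, 8, 4, 6, 3, 2),
  group1 = c(0, 0, 0, 1, 1, 1, 0, 1, 1),
  group2 = c(0, 0, 1, 0, 0, 0, 1, 0, 0),
  x      = c(NA, NA, 2.5, NA, NA, NA, 1.5, NA, NA)
)
```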

Now I want to calculate a column y, where y = 0 if group1 = 0, and y = value - (the most recent non-NA value of x) if group1 = 1. For example, in row 5 that gives 8 - 2.5 = 5.5. The result should look like this:

value  group1  group2   x       y
1        0       0      NA      0
2        0       0      NA      0
7        0       1      2.5     0
5        1       0      NA      2.5
8        1       0      NA      5.5
4        1       0      NA      1.5
6        0       1      1.5     0
3        1       0      NA      1.5
2        1       0      NA      0.5

I would really appreciate any help. Thanks a lot!


Solution

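  • Using zoo::na.locf0, which carries the last non-NA value of x forward while keeping the vector the same length (NAs before the first non-NA value stay NA):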

    transform(df, y = ifelse(group1 == 0, 0, value - zoo::na.locf0(x)))
    
    #  value group1 group2   x   y
    #1     1      0      0  NA 0.0
    #2     2      0      0  NA 0.0
    #3     7      0      1 2.5 0.0
    #4     5      1      0  NA 2.5
    #5     8      1      0  NA 5.5
    #6     4      1      0  NA 1.5
    #7     6      0      1 1.5 0.0
    #8     3      1      0  NA 1.5
    #9     2      1      0  NA 0.5
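
If adding the zoo dependency is not an option, the same carry-forward step can probably be done in base R. The following is only a sketch: it rebuilds the example data under the assumed name `df`, and the helper vectors `idx` and `x_filled` are hypothetical names introduced for illustration. The idea is to find, for each row, the index of the most recent non-NA `x` with `cummax()` and then look up `x` at that index.

```r
# Example data from the question; `df` is an assumed name.
df <- data.frame(
  value  = c(1, 2, 7, 5, 8, 4, 6, 3, 2),
  group1 = c(0, 0, 0, 1, 1, 1, 0, 1, 1),
  group2 = c(0, 0, 1, 0, 0, 0, 1, 0, 0),
  x      = c(NA, NA, 2.5, NA, NA, NA, 1.5, NA, NA)
)

# For each row, the index of the most recent non-NA x at or before it
# (0 while no non-NA value has been seen yet).
idx <- cummax(seq_along(df$x) * !is.na(df$x))

# Carry x forward; rows before the first non-NA value stay NA.
x_filled <- rep(NA_real_, nrow(df))
x_filled[idx > 0] <- df$x[idx[idx > 0]]

# y is 0 where group1 == 0, otherwise value minus the carried-forward x.
df$y <- ifelse(df$group1 == 0, 0, df$value - x_filled)
```

Note that if a row with group1 == 1 appeared before any non-NA `x`, its `y` would be NA in this sketch, since there is nothing to subtract yet.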