Using dplyr and tail to change the last value in each group_by group in R


While using dplyr, I'm having trouble changing the last value in my data frame. I want to group by user_id and tag and set Time to 0 for the last row in each group.

     user_id     tag   Time
1  268096674       1    3
2  268096674       1    10
3  268096674       1    1
4  268096674       1    0
5  268096674       1    9999
6  268096674       2    0
7  268096674       2    9
8  268096674       2    500
9  268096674       3    0
10 268096674       3    1
...

Desired output:

     user_id     tag   Time
1  268096674       1    3
2  268096674       1    10
3  268096674       1    1
4  268096674       1    0
5  268096674       1    0
6  268096674       2    0
7  268096674       2    9
8  268096674       2    0
9  268096674       3    0
10 268096674       3    1
...

I've tried something like this, among other things, and can't figure it out:

df %>%
  group_by(user_id,tag) %>%
  mutate(tail(Time) <- 0)

I tried adding a row number as well, but couldn't quite put it all together. Any help would be appreciated.


Solution

  • I would like to offer an alternative approach that avoids copying the whole column (which both Time[-n()] and replace do) and instead modifies the data in place:

    library(data.table)
    indx <- setDT(df)[, .I[.N], by = .(user_id, tag)]$V1 # row index (.I) of the last row (.N) in each group
    df[indx, Time := 0L] # set Time to 0 for those rows, modifying in place
    df
    #       user_id tag Time
    #  1: 268096674   1    3
    #  2: 268096674   1   10
    #  3: 268096674   1    1
    #  4: 268096674   1    0
    #  5: 268096674   1    0
    #  6: 268096674   2    0
    #  7: 268096674   2    9
    #  8: 268096674   2    0
    #  9: 268096674   3    0
    # 10: 268096674   3    0
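For comparison, the two dplyr idioms mentioned above (replace and Time[-n()]) can be sketched as follows. This is a minimal sketch, not a definitive implementation: df is rebuilt here from the ten sample rows in the question, and the names res_replace and res_concat are only illustrative.

```r
library(dplyr)

# Sample data rebuilt from the ten rows shown in the question (assumption:
# the "..." rows are left out, so group 3 ends at row 10).
df <- data.frame(
  user_id = 268096674,
  tag     = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3),
  Time    = c(3, 10, 1, 0, 9999, 0, 9, 500, 0, 1)
)

# Variant 1: replace() overwrites position n(), the last row of each group.
res_replace <- df %>%
  group_by(user_id, tag) %>%
  mutate(Time = replace(Time, n(), 0)) %>%
  ungroup()

# Variant 2: drop the last element with Time[-n()] and append a 0.
res_concat <- df %>%
  group_by(user_id, tag) %>%
  mutate(Time = c(Time[-n()], 0)) %>%
  ungroup()
```

With this sample, both variants give a Time column of 3, 10, 1, 0, 0, 0, 9, 0, 0, 0. Each one builds a new Time vector per group, which is the copying that the data.table version avoids.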