Issues using first and mutate with group_by


I am using mutate to create a column that depends on the first value of each group:

library(tidyverse)
test = data.frame(grp = c(1,1,1,2,2,2), x = c(1,2,3,1,2,3), y = c(1,2,3,1,2,3))

test
  grp x y
1   1 1 1
2   1 2 2
3   1 3 3
4   2 1 1
5   2 2 2
6   2 3 3

test %>% group_by(grp) %>% 
  mutate(y = ifelse(grp[[1]] == x[[1]], y-1, y))

    grp     x     y
  <dbl> <dbl> <dbl>
1     1     1     0
2     1     2     0
3     1     3     0
4     2     1     1
5     2     2     1
6     2     3     1

However, the output is not what I expected.
The expected output is:

    grp     x     y
  <dbl> <dbl> <dbl>
1     1     1     0
2     1     2     1
3     1     3     2
4     2     1     1
5     2     2     2
6     2     3     3

Can you please explain what is happening and how best to get my expected output?


Solution

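  • ifelse() returns a result the same length as its test. Because grp[[1]] == x[[1]] is a single value, ifelse() returns a single value per group (the first element of y-1 when the test is TRUE, otherwise the first element of y), and mutate() recycles that value across the whole group. To get one result per row, make the test as long as the group: drop the [[1]] from grp and compare the whole grp column to first(x). The code below stores the result in new_y; assign to y instead to overwrite the original column, i.e.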

    library(dplyr)
    
    test %>% 
     group_by(grp) %>% 
     mutate(new_y = ifelse(grp == first(x), y-1, y))
    
    # A tibble: 6 × 4
    # Groups:   grp [2]
        grp     x     y new_y
      <dbl> <dbl> <dbl> <dbl>
    1     1     1     1     0
    2     1     2     2     1
    3     1     3     3     2
    4     2     1     1     1
    5     2     2     2     2
    6     2     3     3     3
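
Since the condition here is really a single TRUE/FALSE per group, another option is a plain if/else inside mutate(), which returns the whole vector for the group. The following is only a sketch under that assumption, not part of the original answer; the names short and result are made up for illustration. It also shows the length-1 behaviour of ifelse() that caused the unexpected output.

```r
library(dplyr)

test <- data.frame(grp = c(1, 1, 1, 2, 2, 2),
                   x   = c(1, 2, 3, 1, 2, 3),
                   y   = c(1, 2, 3, 1, 2, 3))

# ifelse() returns a result shaped like its test, so a length-1 test
# gives a length-1 result, however long the yes/no arguments are.
short <- ifelse(TRUE, c(0, 1, 2), c(1, 2, 3))
length(short)  # 1

# Illustrative alternative: the condition is one value per group,
# so if/else returns either y - 1 or y for the whole group.
result <- test %>%
  group_by(grp) %>%
  mutate(y = if (first(grp) == first(x)) y - 1 else y) %>%
  ungroup()

result$y  # 0 1 2 1 2 3
```

This overwrites y directly, which matches the three-column output asked for in the question. If the condition should instead vary row by row, the ifelse() version with a full-length test is the better fit.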