Search code examples
rdataframegroup-byconditional-formatting

Assign value to other group members with conditions in a complex data structure


A MWE is as follows:

df = tibble(
      Group = c(rep(1,9), rep(2,9)),
      Stage = rep(c(1,1,1,rep(2,6)),2),
      Period = rep(c(rep(1,6),2,2,2),2),
      Role = rep(c(NA,NA,NA,0,1,0,0,0,1),2),
      Value = round(rnorm(n=18,mean=10,sd=5), digits = 1)
)

# A tibble: 18 x 5
   Group Stage Period  Role Value
   <dbl> <dbl>  <dbl> <dbl> <dbl>
 1     1     1      1    NA  10.8
 2     1     1      1    NA  15.5
 3     1     1      1    NA   7.4
 4     1     2      1     0   3.7
 5     1     2      1     1   6.7
 6     1     2      1     0   7.5
 7     1     2      2     0  15.1
 8     1     2      2     0   8.2
 9     1     2      2     1   5.1
10     2     1      1    NA  17.3
11     2     1      1    NA  14.9
12     2     1      1    NA  11.1
13     2     2      1     0  13.3
14     2     2      1     1   3.6
15     2     2      1     0   7.2
16     2     2      2     0  13.1
17     2     2      2     0  16.5
18     2     2      2     1  11  

Notes on df: In df, for each group, they go through Stage 1 and 2 with multiple periods. In stage 2, one group member is assigned a unique role.

My objective: In stage 2, I want to assign Role 1's value to other group members in each period for each group. The desired outcome would be:

# A tibble: 18 x 5
   Group Stage Period  Role Value newValue
   <dbl> <dbl>  <dbl> <dbl> <dbl>  <dbl>
 1     1     1      1    NA  10.8   10.8
 2     1     1      1    NA  15.5   15.5
 3     1     1      1    NA   7.4    7.4
 4     1     2      1     0   3.7    6.7
 5     1     2      1     1   6.7    6.7
 6     1     2      1     0   7.5    6.7
 7     1     2      2     0  15.1    5.1
 8     1     2      2     0   8.2    5.1 
 9     1     2      2     1   5.1    5.1
10     2     1      1    NA  17.3   17.3
11     2     1      1    NA  14.9   14.9
12     2     1      1    NA  11.1   11.1
13     2     2      1     0  13.3    3.6
14     2     2      1     1   3.6    3.6
15     2     2      1     0   7.2    3.6  
16     2     2      2     0  13.1   11 
17     2     2      2     0  16.5   11
18     2     2      2     1  11     11

I can simply separate df into two tibbles by Stage, and then solve the problem with this method suggested by David Arenburg as follows:

df_Stage2 <- df %>% filter (Stage == 2) %>%
  group_by(Group, Period) %>%
  mutate(newValue = Value[Role == 1]) %>%
  ungroup

However, I believe there's a way to do it with the whole data structure. I tried the if_else and case_when methods, for example:

df <- df %>%
  group_by(Group, Period) %>%
  mutate(
    newValue = if_else(Stage == 1, -99, Value[Role == 1])
  ) %>%
  ungroup

However, either method does not work. Is there a way to solve this problem without separating df into two dfs by Stage? Thanks a lot!

In addition, in the solution suggested by David Arenburg, he used Value[Role == 1L] instead of Value[Role == 1], would anyone please explain to me what the L does here?


Solution

  • You may use match here -

    library(dplyr)
    
    df %>%
      group_by(Group, Stage, Period) %>%
      mutate(NewValue = ifelse(Stage == 2, Value[match(1, Role)], Value)) %>%
      ungroup()
    
    #   Group Stage Period  Role Value NewValue
    #   <int> <int>  <int> <int> <dbl>    <dbl>
    # 1     1     1      1    NA  10.8     10.8
    # 2     1     1      1    NA  15.5     15.5
    # 3     1     1      1    NA   7.4      7.4
    # 4     1     2      1     0   3.7      6.7
    # 5     1     2      1     1   6.7      6.7
    # 6     1     2      1     0   7.5      6.7
    # 7     1     2      2     0  15.1      5.1
    # 8     1     2      2     0   8.2      5.1
    # 9     1     2      2     1   5.1      5.1
    #10     2     1      1    NA  17.3     17.3
    #11     2     1      1    NA  14.9     14.9
    #12     2     1      1    NA  11.1     11.1
    #13     2     2      1     0  13.3      3.6
    #14     2     2      1     1   3.6      3.6
    #15     2     2      1     0   7.2      3.6
    #16     2     2      2     0  13.1     11  
    #17     2     2      2     0  16.5     11  
    #18     2     2      2     1  11       11