
How to create a column based on conditions with rows


I have the following problem:

library(tibble)

Shared_ID<-c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5)
Individual_ID<-c(11,12,13,21,22,23,31,32,33,41,42,43,51,52,53)
Individual_Con<-c(1,2,3,1,1,1,2,2,2,3,3,3,3,2,1)

table<-tibble(Shared_ID,Individual_ID,Individual_Con)
table

What I'm looking for is a way to create a new column called Shared_Con that, for each Shared_ID, shows a number based on the following:

Individual_Con==1 ~ 1
Individual_Con==2 ~ 2
Individual_Con==3 ~ 3
any combination of Individual_Con ~ 4 

This means that if all the Individual_Con values within a Shared_ID are the same (e.g. all equal to 1), then Shared_Con takes that value (here 1). The last case applies when there are at least 2 different Individual_Con values within a Shared_ID, in which case Shared_Con is 4.

This is my desired result:

# A tibble: 15 x 4
   Shared_ID Individual_ID Individual_Con Shared_Con
       <dbl>         <dbl>          <dbl>      <dbl>
 1         1            11              1          4
 2         1            12              2          4
 3         1            13              3          4
 4         2            21              1          1
 5         2            22              1          1
 6         2            23              1          1
 7         3            31              2          2
 8         3            32              2          2
 9         3            33              2          2
10         4            41              3          3
11         4            42              3          3
12         4            43              3          3
13         5            51              3          4
14         5            52              2          4
15         5            53              1          4

How can I do this easily? Thanks in advance for any help!


Solution

  • We can group by 'Shared_ID' and check whether the number of distinct values in 'Individual_Con' is greater than 1. If it is, return 4; otherwise return 'Individual_Con' unchanged (all values in such a group are identical, so this is the shared value).

    library(dplyr)
    table %>%
         group_by(Shared_ID) %>%
         mutate(Shared_Con = if(n_distinct(Individual_Con) > 1) 4 else Individual_Con)
    # A tibble: 15 x 4
    # Groups:   Shared_ID [5]
    #   Shared_ID Individual_ID Individual_Con Shared_Con
    #       <dbl>         <dbl>          <dbl>      <dbl>
    # 1         1            11              1          4
    # 2         1            12              2          4
    # 3         1            13              3          4
    # 4         2            21              1          1
    # 5         2            22              1          1
    # 6         2            23              1          1
    # 7         3            31              2          2
    # 8         3            32              2          2
    # 9         3            33              2          2
    #10         4            41              3          3
    #11         4            42              3          3
    #12         4            43              3          3
    #13         5            51              3          4
    #14         5            52              2          4
    #15         5            53              1          4
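
  • For comparison, here is a minimal base R sketch of the same per-group rule, assuming you would rather not depend on dplyr. It uses `ave()`, which applies a function to `Individual_Con` within each `Shared_ID` and puts the results back in the original row order. The data frame name `dat` and the helper `shared_con` are illustrative names, not part of the original question.

```r
# Example data from the question, stored in a plain data.frame
# (`dat` is a hypothetical name, chosen to avoid masking base::table)
dat <- data.frame(
  Shared_ID      = c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5),
  Individual_ID  = c(11,12,13,21,22,23,31,32,33,41,42,43,51,52,53),
  Individual_Con = c(1,2,3,1,1,1,2,2,2,3,3,3,3,2,1)
)

# Hypothetical helper: for one group's values, return 4 for every row
# if the group is mixed, otherwise return the values unchanged
shared_con <- function(x) {
  if (length(unique(x)) > 1) rep(4, length(x)) else x
}

# ave() splits Individual_Con by Shared_ID, applies shared_con to each
# piece, and reassembles the result in the original row order
dat$Shared_Con <- ave(dat$Individual_Con, dat$Shared_ID, FUN = shared_con)

dat
```

    Unlike the dplyr version, the result here is an ordinary data frame with no grouping attached, so no `ungroup()` step is needed afterwards.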