Search code examples
rgroup-byconditional-statementsfilldplyr

R conditional mutate date column using groupby and fill


similarly worded questions out there but none show quite what I'm looking to do. I have a sample of a dataframe below. I would like to group_by ID and create a Date2 column where rank=2. I'm having a hard time figuring this out.

     ID   Rank    Date       Date2
1   5678   1    2000-01-01   2010-05-02    
2   5678   2    2010-05-02   2010-05-02  
3   1234   1    2000-01-01   2015-06-03     
4   1234   2    2015-06-03   2015-06-03   

Here's what I have so far:

df <- df %>% group_by(ID) %>%fill(Date2,.direction='up')

How can I go about this?


Solution

  • Try this:

    library(dplyr)
    #Code
    df %>% group_by(ID) %>% mutate(Date2=Date[Rank==2])
    

    Output:

    # A tibble: 4 x 4
    # Groups:   ID [2]
         ID  Rank Date       Date2     
      <int> <int> <chr>      <chr>     
    1  5678     1 2000-01-01 2010-05-02
    2  5678     2 2010-05-02 2010-05-02
    3  1234     1 2000-01-01 2015-06-03
    4  1234     2 2015-06-03 2015-06-03
    

    Some data used:

    #Data
    df <- structure(list(ID = c(5678L, 5678L, 1234L, 1234L), Rank = c(1L, 
    2L, 1L, 2L), Date = c("2000-01-01", "2010-05-02", "2000-01-01", 
    "2015-06-03")), row.names = c("1", "2", "3", "4"), class = "data.frame")
    

    Also if you want to use fill() you can try this code. You would have to use a conditional like ifelse() to assign the date and then fill the values:

    #Code 2
    df %>% group_by(ID) %>%
      mutate(Date2=ifelse(Rank==2,Date,NA)) %>%
      fill(Date2,.direction = 'up')
    

    Output:

    # A tibble: 4 x 4
    # Groups:   ID [2]
         ID  Rank Date       Date2     
      <int> <int> <chr>      <chr>     
    1  5678     1 2000-01-01 2010-05-02
    2  5678     2 2010-05-02 2010-05-02
    3  1234     1 2000-01-01 2015-06-03
    4  1234     2 2015-06-03 2015-06-03