Search code examples
rlag

New column value based on matching previous values in different columns


I'm having trouble generating a new column in my dataframe which is based on matching dates in different columns:

df looks something like this:

ID        date  booked.date   weather
 1  2016-12-01           NA    clouds
 1  2016-12-02   2014-10-24     sunny           
 1  2016-12-03           NA  overcast         
 2  2016-12-01   2015-12-24    clouds           
 2  2016-12-02   2016-12-01     sunny
 2  2016-12-03   2016-12-01  overcast
 2  2016-12-04   2016-01-13     sunny

date indicates the date of stay at the apartment, booking_date tells us when the apartment was booked. Now I would like to add a column booked_weather indicating the weather during the time of booking if this information is included in the df. The output would then look like this:

ID        date  booked.date   weather booked_weather
 1  2016-12-01           NA    clouds             NA
 1  2016-12-02   2014-10-24     sunny             NA
 1  2016-12-03           NA  overcast             NA
 2  2016-12-01   2015-12-24    clouds             NA
 2  2016-12-02   2016-12-01     sunny         clouds
 2  2016-12-03   2016-12-01  overcast         clouds
 2  2016-12-04   2016-01-13     sunny             NA

Note that there are readings for multiple apartment IDs, hence repeated dates with the same weather.

Here is what I have tried, does not quite get me what I need:

df %>%
  mutate(weather_booked = case_when(
    booked.date %in% date ~ weather[booked.date]
  ))

I understand why this won't give me the correct result, but I am not sure how to fix it.


Solution

  • library(tidyverse)
    
    df <- read_table("ID  date  booked.date   weather
     1  2016-12-01           NA    clouds
     1  2016-12-02   2014-10-24     sunny           
     1  2016-12-03           NA  overcast         
     2  2016-12-01   2015-12-24    clouds           
     2  2016-12-02   2016-12-01     sunny
     2  2016-12-03   2016-12-01  overcast
     2  2016-12-04   2016-01-13     sunny") 
    
    
    df %>%  
      mutate(weather_booked = weather[match(booked.date, date)])
    
    
    #> # A tibble: 7 x 5
    #>      ID date       booked.date weather  weather_booked
    #>   <dbl> <date>     <date>      <chr>    <chr>         
    #> 1     1 2016-12-01 NA          clouds   <NA>          
    #> 2     1 2016-12-02 2014-10-24  sunny    <NA>          
    #> 3     1 2016-12-03 NA          overcast <NA>          
    #> 4     2 2016-12-01 2015-12-24  clouds   <NA>          
    #> 5     2 2016-12-02 2016-12-01  sunny    clouds        
    #> 6     2 2016-12-03 2016-12-01  overcast clouds        
    #> 7     2 2016-12-04 2016-01-13  sunny    <NA>
    

    Created on 2022-06-29 by the reprex package (v2.0.1)