Search code examples
rdataframedplyrlubridate

Change values from a column for two (or N) consecutive days if condition met in R


I have following dataframe:

df <- structure(list(DateTime = structure(c(1477978200, 1477980000, 1477981800, 1477983600, 1477985400, 1477987200, 1477989000, 1477990800, 1477992600, 1477994400, 1477996200, 1477998000, 1477999800, 1478001600, 1478003400, 1478005200, 1478007000, 1478008800, 1478010600, 1478012400, 1478014200, 1478016000, 1478017800, 1478019600, 1478021400, 1478023200, 1478025000, 1478026800, 1478028600, 1478030400, 1478032200, 1478034000, 1478035800, 1478037600, 1478039400, 1478041200, 1478043000, 1478044800, 1478046600, 1478048400, 1478050200, 1478052000, 1478053800, 1478055600, 1478057400, 1478059200, 1478061000, 1478062800, 1478064600, 1478066400, 1478068200, 1478070000, 1478071800, 1478073600, 1478075400, 1478077200, 1478079000, 1478080800, 1478082600, 1478084400, 1478086200, 1478088000, 1478089800, 1478091600, 1478093400, 1478095200, 1478097000, 1478098800, 1478100600, 1478102400, 1478104200, 1478106000, 1478107800, 1478109600, 1478111400, 1478113200, 1478115000, 1478116800, 1478118600, 1478120400), class = c("POSIXct", "POSIXt"), tzone = "America/Chicago"), Date = structure(c(17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107), class = "Date"), Rain_daily = c(8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), PET_daily = c(7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4), Mask = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE)), row.names = c(NA, 80L), class = "data.frame")

I want to change values in column "Mask" to "FALSE" for two consecutive days if "Rain_daily" is greater than "PET_daily" for a particular day. For example, for the given dataframe, values in Mask column will be replaced with FALSE for both the days "2016-11-01" and "2016-11-02" (as 8>7).

How can we do this? I want to avoid for looping.


Solution

  • using case_when from dplyr.

    if date meets the condition Rain_daily > PET_daily, chenge Mask for these (df[Rain_daily > PET_daily, Date]) and next (df[Rain_daily > PET_daily, Date]+1) days. elsewhere leave Mask as it is (TRUE ~ Mask)

    library(data.table)
    library(dplyr)
    
    df <- df %>% as.data.table() # to simlify operations on df
    df <- df %>% 
      mutate(Mask = case_when(Date %in% c(df[Rain_daily > PET_daily, Date],df[Rain_daily > PET_daily, Date]+1) ~ FALSE,
                              TRUE ~Mask ))
    

    Hope that helps