Tags: r, loops, date, comparison, row

Compare adjacent rows in R


In my data frame, I have a column "dates". I would like R to loop through each row and check whether the date before or after it falls within a 3-14 day range. If neither does, the row's index should be added to a list so the row can be removed once the loop finishes.

For example:

my_dates <- c( "1/4/2019", "1/18/2019", "4/3/2019", "2/20/2019", "4/5/2019")

I want to remove the entire row containing 2/20/2019 because no other date is within 3-14 days of it.

Any help would be greatly appreciated!


Solution

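  • Here's a verbose way using lubridate and dplyr: sort the dates, compute the gap to the previous and the next date, and keep the rows whose nearest neighbor is at most 14 days away. Note that it applies only the 14-day upper bound, not the 3-day lower bound.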

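    library(lubridate)
    library(dplyr)

    my_dates <- c("1/4/2019", "1/18/2019", "4/3/2019", "2/20/2019", "4/5/2019")

    df <- data.frame(dates = mdy(my_dates)) %>%
      arrange(dates) %>%                             # sort so neighbors are adjacent rows
      mutate(days_prior = dates - lag(dates),        # gap to the previous date
             days_after = lead(dates) - dates) %>%   # gap to the next date
      # gap to the nearest neighbor; na.rm ignores the missing side at either end
      mutate(closest_day = pmin(days_prior, days_after, na.rm = TRUE)) %>%
      filter(closest_day <= 14)                      # keep rows with a neighbor within 14 days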
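
    If the 3-day lower bound is meant literally, checking only the adjacent rows may not be enough: the nearest date can be too close while a date further away still falls in range. Below is a minimal base R sketch of that stricter reading, comparing every pair of dates. The helper name `has_neighbor_in_range` and its `min_gap`/`max_gap` arguments are my own, not from the question or any package. Be aware that on the sample data this also drops 4/3/2019 and 4/5/2019 (they are only 2 days apart), which differs from the outcome described in the question, so use it only if that is what you want.

```r
# Hypothetical helper (an assumption for illustration): TRUE for each date that
# has at least one OTHER date between min_gap and max_gap days away, inclusive.
has_neighbor_in_range <- function(dates, min_gap = 3, max_gap = 14) {
  days <- as.numeric(dates)              # Date -> days since 1970-01-01
  gaps <- abs(outer(days, days, "-"))    # matrix of all pairwise gaps in days
  diag(gaps) <- NA                       # never compare a date with itself
  rowSums(gaps >= min_gap & gaps <= max_gap, na.rm = TRUE) > 0
}

my_dates <- c("1/4/2019", "1/18/2019", "4/3/2019", "2/20/2019", "4/5/2019")
df <- data.frame(dates = as.Date(my_dates, format = "%m/%d/%Y"))

keep <- has_neighbor_in_range(df$dates)
df_strict <- df[keep, , drop = FALSE]    # only 1/4/2019 and 1/18/2019 remain
```

    The pairwise matrix grows with the square of the number of rows, so this is fine for a few thousand dates but not for very large data frames.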