Search code examples
rdataframecriteria

Set Certain Values to a Data Frame Based a Date Criteria


I want to set the value NA to certain rows/columns within a dataframe that occurs before a certain date. But each column has a different date/criteria. So how would this work?

Sample Dataframe:

dates <- c("01/01/2015", "06/15/2015", "11/30/2015")
a <- c(1, 2, 3)
b <- c(2, 4, 6)
c <- c(3, 5, 9)
df <- data.frame(Date = dates, A = a, B = b, C = c)

startDate <- c("02/20/2015", "07/28/2015", "12/01/2015")

So my data.frame would look like this:

    Date        A      B     C
    01/01/2015  1      2     3
    06/15/2015  2      4     5
    11/30/2015  3      6     9

Using the startDate as my criteria, I want to set any value before that date in the relative column to NA so that my end result would look like this:

    Date        A      B     C
    01/01/2015  NA     NA    NA
    06/15/2015  2      NA    NA
    11/30/2015  3      6     NA

Can someone please help me? Btw, my real data.frame has about 20+ columns. The above is just a smaller example of my actual problem.

Thanks in advance!


Solution

  • Test this:

    #Do a pairwise comparison of dates using outer
    m = matrix(
        as.numeric(
            outer(as.Date(as.character(df[,1]), format = "%m/%d/%Y"),
                  as.Date(as.character(startDate), format = "%m/%d/%Y"), ">" )
        ),
        nrow = nrow(df))
    
    m[m == 0] = NA #Set zeroes in m to NA
    df[,2:ncol(df)] = df[,2:ncol(df)] * m #Multiply the columns of df (except 1st) with m
    df
    #        Date  A  B  C
    #1 01/01/2015 NA NA NA
    #2 06/15/2015  2 NA NA
    #3 11/30/2015  3  6 NA