
How to use date formats to write conditions in data frames


My code takes a very long time to run. I have a data frame of approximately 23000 columns and 600 rows that follows this pattern:

date <- c(30032015,30042015,31052015,30062015,31072015,31082015,30092015)
AAPL <- c(10,NA,NA,10,NA,NA,20)
MSFT <- c(10,NA,NA,30,NA,NA,25)
sales <- data.frame (date,AAPL,MSFT)
sales$date <- strptime (sales$date, format="%d%m%Y")

I want the April and May values to equal the March value, and likewise the July and August values to equal the June value.

This is what I am doing at the moment:

sales [is.na(sales)] <- 0

for (i in 1:6) {
  for (j in 2:3) {
    sales[i, j] <- ifelse(sales[i, j] > 0, sales[i, j],
                   ifelse(sales[i-1, j] > 0, sales[i-1, j],
                   ifelse(sales[i-2, j] > 0, sales[i-2, j], NA)))
  }
}

However, for a big data frame this takes many hours. Is there a way to say directly that the values in months 4 and 5 are equal to the values in month 3, and so on?

Thank you in advance


Solution

  • You probably want the na.locf() function from the zoo package. It carries the last observation forward to replace NA values, so apply it while the gaps are still NA (skip the step that sets them to 0).

    library(zoo)
    # na.rm = FALSE keeps leading NAs, so every column keeps its original length
    sales[, 2:3] <- lapply(sales[, 2:3], na.locf, na.rm = FALSE)
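
If installing zoo is not an option, the same carry-forward idea can be sketched in base R. This is only a minimal illustration on the sample data from the question: `locf_base` and `value_cols` are made-up names, and parsing the dates with as.Date() instead of strptime() is an assumption made for convenience.

```r
# Sample data from the question (dates parsed as Date rather than POSIXlt).
sales <- data.frame(
  date = as.Date(c("30032015", "30042015", "31052015", "30062015",
                   "31072015", "31082015", "30092015"), format = "%d%m%Y"),
  AAPL = c(10, NA, NA, 10, NA, NA, 20),
  MSFT = c(10, NA, NA, 30, NA, NA, 25)
)

# Hypothetical helper: carry the last non-NA value forward.
# Leading NAs (nothing to carry yet) stay NA.
locf_base <- function(x) {
  idx <- cumsum(!is.na(x))        # how many non-NA values seen so far (0 before the first)
  c(NA, x[!is.na(x)])[idx + 1]    # slot 1 holds NA for leading gaps
}

# Fill every column except the date column, one vectorised pass per column.
value_cols <- setdiff(names(sales), "date")
sales[value_cols] <- lapply(sales[value_cols], locf_base)

print(sales)
```

Each column is filled in one vectorised pass instead of cell by cell, which is what should make it much faster than the nested loop on a wide data frame. Unlike the loop in the question, which looks back at most two rows, this carries a value forward across a gap of any length.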