I have 1241 daily observations from 2012-11-19 to 2017-10-16, but only for weekdays (the number of services in a cafeteria). I'm trying to do forecasting, but I have trouble initializing my time series:
timeseries = ts(passage, frequency = 365,
start = c(2012, as.numeric(format(as.Date("2012-11-19"), "%j"))),
end = c(2017, as.numeric(format(as.Date("2017-10-16"), "%j"))) )
If I do it like that, because of the missing weekends, my values are recycled after reaching observation 1241, all the way to 1791 (which corresponds to the number of days between my two dates). And if I want to build a training time series, choosing a date with the "end" parameter will not correspond to the actual data for that date.
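A minimal sketch of the recycling described above, using five made-up values instead of the real data. When "end" asks for more time points than there are observations, ts() repeats the data to fill the span:

```r
x <- 1:5                                     # five made-up observations

# Ask for ten daily time points (day 324 to day 333 of 2012)
y <- ts(x, frequency = 365,
        start = c(2012, 324), end = c(2012, 333))

length(y)       # 10
as.numeric(y)   # 1 2 3 4 5 1 2 3 4 5
```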
So how can I overcome this problem? I know that I can create my time series directly with the call below (and am I choosing the right frequency? If I put 5 or 7, the axis goes into very distant years):
timeseries = ts(passage, frequency = 365)
but I lose the ability to choose a start and end date and can't see that information in a plot.
Edit: The reason I want to keep it as weekly data with 5 days is so that when I plot the forecast, I don't get lots of zeros in the plot
plot(forecast(timeseries_00))
like this
If I understand your problem correctly, this could be a solution:
Step 1) I create a time series (passage) with length 1241 like yours.
passage <- 1:1241  # placeholder values, same length as the real series
Step 2) I convert the time series into a matrix in which every column is a working day (adding 4 zeros because the time series ends on a Monday). After that, I add two additional columns of zeros to the matrix (Saturday and Sunday). I then go back to a vector using the function unmatrix (package gdata) and delete the last 6 zeros (4 added by me and 2 coming from the Saturday and Sunday columns).
passage_matrix<-cbind(t(matrix(c(passage,c(0,0,0,0)),nrow = 5)),0,0)
library(gdata)
passage_00<-as.numeric(unmatrix( passage_matrix ,byrow=T))
passage_00<-passage_00[1:(length(passage_00)-6)]
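The same padding can be sketched in base R, without gdata and without hard-coding the 4 and the 6. The helper name pad_weekends is hypothetical, and the sketch assumes, like the steps above, that the series starts on a Monday and has no missing weekdays:

```r
# Hypothetical helper: insert zero-valued Saturday/Sunday slots into a
# Monday-to-Friday series that starts on a Monday.
pad_weekends <- function(x, fill = 0) {
  n_pad <- (5 - length(x) %% 5) %% 5                 # slots needed to complete the last week
  weeks <- matrix(c(x, rep(fill, n_pad)), nrow = 5)  # one column per week, rows Mon..Fri
  full  <- rbind(weeks, fill, fill)                  # append Saturday and Sunday rows
  out   <- as.vector(full)                           # column-major order is chronological
  out[seq_len(length(out) - n_pad - 2)]              # drop the padding and the trailing weekend
}

passage    <- 1:1241
passage_00 <- pad_weekends(passage)
length(passage_00)   # 1737
```

Here the matrix is laid out with one column per week rather than one column per weekday, so reading it column by column already gives the values in date order.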
Step 3) I create my new time series
timeseries_00 = ts(passage_00,
frequency = 365,
start = c(2012, as.numeric(format(as.Date("2012-11-19"),
"%j"))))
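As a quick check of the start argument, the sketch below shows what the "%j" expression evaluates to and how ts() stores it. The three zero values are placeholders, not real data:

```r
# Day of the year for the first observation (2012 is a leap year)
doy <- as.numeric(format(as.Date("2012-11-19"), "%j"))
doy        # 324

# Placeholder series, only used to inspect the time index
x <- ts(rep(0, 3), frequency = 365, start = c(2012, doy))
start(x)   # 2012 324
time(x)    # fractional years, starting at 2012 + 323/365
```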
Step 4) Now I'm able to plot the time series with correct date labels (only for working days in my example below)
date<-seq(from=as.Date("2012-11-19"),by=1,length.out=length(timeseries_00))
plot(timeseries_00[timeseries_00>0],axes=F)
axis(1, at=1:length(timeseries_00[timeseries_00>0]), labels=date[timeseries_00>0])
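The plot above picks working days by testing for values greater than 0, which would also drop a genuine zero in real data. A possible alternative, sketched here under the same assumption of 1737 daily slots starting on Monday 2012-11-19, is to flag working days from the calendar itself:

```r
# Assumption: 1737 daily slots starting on Monday 2012-11-19, as in the steps above
n_days <- 1737
dates  <- seq(from = as.Date("2012-11-19"), by = "day", length.out = n_days)

# "%u" gives the ISO weekday number as text: "1" = Monday, ..., "7" = Sunday
is_workday <- as.integer(format(dates, "%u")) <= 5

sum(is_workday)           # number of Monday-to-Friday slots
head(dates[is_workday])   # first few working-day dates
```

The logical vector is_workday could then replace timeseries_00>0 when subsetting the values and the date labels.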
"passage" time series with right date
Step 4) Forecast the time series
library(forecast)
for_00<-forecast(timeseries_00)
Step 5) I have to extend my original time series so that the forecast data and the original data can be put on the same axis
length(for_00$mean) #length of the prediction
passage_00extended<-c(passage_00,rep(0,length(for_00$mean))) #Add zeros for future dates (730 here)
timeseries_00extended = ts(passage_00extended, frequency = 365,
start = c(2012, as.numeric(format(as.Date("2012-11-19"), "%j"))))
date<-seq(from=as.Date("2012-11-19"),by=1,length.out=length(timeseries_00extended))
Step 6) I have to pad the predicted data so that it has the same length as timeseries_00extended; all fake data (0 values) are changed to NA
pred_mean<-c(rep(NA,length(passage_00)),for_00$mean) #Prediction mean
pred_upper<-c(rep(NA,length(passage_00)),for_00$upper[,2]) #Upper 95%
pred_lower<-c(rep(NA,length(passage_00)),for_00$lower[,2]) #Lower 95%
passage_00extended[passage_00extended==0]<-rep(NA,sum(passage_00extended==0))
Step 7) I plot the original data (passage_00extended) and the predictions on the same plot (with different colours for the mean [blue] and the upper and lower bounds [orange])
plot(passage_00extended,axes=F,ylim=c(1,max(pred_upper,na.rm=TRUE)))
lines(pred_mean,col="Blue")
lines(pred_upper,col="orange")
lines(pred_lower,col="orange")
axis(1, at=1:length(timeseries_00extended), labels=date)
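The alignment idea in Steps 5 to 7 can be seen on a toy example that does not need the forecast package. All numbers below are made up for illustration; the point is only that the history is padded with NA at the end and the prediction is padded with NA at the start, so both vectors share one index:

```r
history <- c(5, 7, 6, 8)     # made-up observed values
fc_mean <- c(7, 7.5, 8)      # made-up point forecasts

# Pad each vector with NA where the other one has data
observed <- c(history, rep(NA, length(fc_mean)))
pred     <- c(rep(NA, length(history)), fc_mean)

# Both vectors now have the same length and can share one x axis
plot(observed, type = "l",
     ylim = range(c(observed, pred), na.rm = TRUE))
lines(pred, col = "blue")
```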