I need help creating a data frame in R that contains a separate column for each forecasted variable. This "new data" will be passed to predict after fitting a regression.
The classical example online is as follows:
data=read.csv("some_path", header=TRUE)
fit<- with(data, lm(y1 ~x1))
pred<-predict.lm(fit, newdata=data.frame("x1"=some_number))
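My procedure has three steps: forecast each regressor, fit the regression, and predict from the forecasts. First, each x is forecast separately: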
fit_x1<- auto.arima(x1, stepwise = FALSE)
for_x1<-forecast(fit_x1, h=10)
...
fit_xn<- auto.arima(xn, stepwise = FALSE)
for_xn<-forecast(fit_xn, h=10)
After forecasting, I need to create a data frame that contains each forecasted variable (x1...xn). The columns of this "new data" must have exactly the names of the x's used later in the regression, and they should hold only the point forecasts, without the Hi and Lo interval bounds.
fit<- lm(formula=log(y1)~log(x1)+log(x2)+...+log(xn), data=data)
summary(fit)
pred<- predict.lm(fit, newdata)
There is a very interesting post on this issue: "Combining forecasts into a data frame in R and then exporting into excel". However, in that case the new data frame has only one column, in which the forecasts from the different x's are combined.
Can this three-step procedure also work for panel time series data?
I guess the following code should help you create a newdata data.frame.
# library(dplyr)
# library(stats)
# library(forecast)
set.seed(123)
dta <- ts(dplyr::tibble(
AA = arima.sim(list(order=c(1,0,0), ar=.5), n=100, mean = 12),
AB = arima.sim(list(order=c(1,0,0), ar=.5), n=100, mean = 12),
AC = arima.sim(list(order=c(1,0,0), ar=.5), n=100, mean = 11),
BA = arima.sim(list(order=c(1,0,0), ar=.5), n=100, mean = 10),
BB = arima.sim(list(order=c(1,0,0), ar=.5), n=100, mean = 14)
), start = c(2013, 1), frequency = 12)
head(dta)
tail(dta)
Create the newdata data.frame/matrix.
nseries <- ncol(dta)
h <- 12 # forecast horizon
newdata <- matrix(nrow = h, ncol = nseries) # empty newdata matrix
for (i in seq_len(nseries)) {
newdata[,i] <- forecast::forecast(forecast::auto.arima(dta[,i]), h = h)$mean
}
colnames(newdata) <- colnames(dta)
head(newdata)
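Note that newdata above is a matrix, while predict() for an lm fit expects a data frame whose column names match the regressors. Below is a minimal, self-contained sketch of the full three steps under some assumptions: the response y1, the coefficients used to simulate it, and the two regressors x1 and x2 are hypothetical stand-ins for your data, and the forecast package is assumed to be installed.

```r
# Sketch only: x1, x2 and y1 are simulated, hypothetical stand-ins for real data.
set.seed(123)
n <- 100
h <- 12  # forecast horizon

# Simulated positive regressors (positive so that log() is defined)
dat <- data.frame(
  x1 = as.numeric(arima.sim(list(order = c(1, 0, 0), ar = 0.5), n = n, mean = 12)),
  x2 = as.numeric(arima.sim(list(order = c(1, 0, 0), ar = 0.5), n = n, mean = 10))
)
# Hypothetical response with a log-log relationship to the regressors
dat$y1 <- exp(0.5 + 0.8 * log(dat$x1) + 0.3 * log(dat$x2) + rnorm(n, sd = 0.05))

# Step 1: forecast each regressor and keep only the point forecasts ($mean)
xnames <- c("x1", "x2")
newdata <- matrix(nrow = h, ncol = length(xnames))
for (i in seq_along(xnames)) {
  x_ts <- ts(dat[[xnames[i]]], start = c(2013, 1), frequency = 12)
  newdata[, i] <- forecast::forecast(forecast::auto.arima(x_ts), h = h)$mean
}
colnames(newdata) <- xnames
newdata <- as.data.frame(newdata)  # predict() needs a data frame, not a matrix

# Step 2: fit the regression on the observed data
fit <- lm(log(y1) ~ log(x1) + log(x2), data = dat)

# Step 3: predict from the forecasted regressors
pred_log <- predict(fit, newdata = newdata)  # predictions on the log scale
pred <- exp(pred_log)                        # back-transformed to the scale of y1
head(pred)
```

Because the model is fitted to log(y1), predict() returns values on the log scale. The plain exp() back-transform used here corresponds to the median of y1 rather than its mean if the errors are normal on the log scale, so a bias correction may be worth considering.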
I hope I understood the problem correctly.