I want to fit a time series model using xgboost in R, and I want to use only the last observation for testing the model (in a rolling-window forecast there will be more test points in total). But when the test data contains only a single observation, I get the error: Error in xgb.DMatrix(data = X[n, ], label = y[n]) : xgb.DMatrix does not support construction from double. Is it possible to do this, or do I need a minimum of 2 test points?
Reproducible example:
library(xgboost)
n = 1000
X = cbind(runif(n,0,20), runif(n,0,20))
y = X %*% c(2,3) + rnorm(n,0,0.1)
train = xgb.DMatrix(data = X[-n,],
                    label = y[-n])
test = xgb.DMatrix(data = X[n,],
                   label = y[n]) # error here, y[.] has 1 value
test2 = xgb.DMatrix(data = X[(n-1):n,],
                    label = y[(n-1):n]) # works here, y[.] has 2 values
There's another post here that addresses a similar issue; however, it refers to the predict() function, whereas I am referring to the test data that will later go into the watchlist argument of xgboost and be used, e.g., for early stopping.
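The problem here is with subsetting the matrix by a single row index. By default (drop = TRUE), R drops the result to a plain numeric vector, and xgb.DMatrix cannot be constructed from that. See: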
class(X[n, ])
# [1] "numeric"
class(X[n,, drop = FALSE])
# [1] "matrix" "array"
Use X[n,, drop = FALSE] to get the test sample:
test = xgb.DMatrix(data = X[n,, drop = FALSE], label = y[n])
xgb.model <- xgboost(data = train, nrounds = 15)
predict(xgb.model, test)
# [1] 62.28553
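Since the question is about passing the single test point to watchlist for early stopping in a rolling forecast, here is a minimal sketch of how that might look. It assumes an xgboost release in which xgb.train() accepts the watchlist argument (newer releases may name it differently), and the parameter values, the number of forecast origins, and the variable names (origins, preds, dtrain, dtest) are illustrative choices, not something from the question.

```r
library(xgboost)

set.seed(1)
n <- 300
X <- cbind(runif(n, 0, 20), runif(n, 0, 20))
y <- as.numeric(X %*% c(2, 3) + rnorm(n, 0, 0.1))

# Forecast the last three points one at a time (expanding training window).
origins <- (n - 2):n
preds <- numeric(length(origins))

for (k in seq_along(origins)) {
  i <- origins[k]
  train_idx <- seq_len(i - 1)

  dtrain <- xgb.DMatrix(data = X[train_idx, , drop = FALSE],
                        label = y[train_idx])
  # drop = FALSE keeps the single test row as a 1 x 2 matrix
  dtest <- xgb.DMatrix(data = X[i, , drop = FALSE], label = y[i])

  fit <- xgb.train(
    params = list(objective = "reg:squarederror", eta = 0.3),
    data = dtrain,
    nrounds = 100,
    # early stopping monitors the last entry of the watchlist
    watchlist = list(train = dtrain, test = dtest),
    early_stopping_rounds = 10,
    verbose = 0
  )

  preds[k] <- predict(fit, dtest)
}

preds
```

Keep in mind that an evaluation metric computed on a single observation is very noisy, so early stopping driven by one test point may stop at a fairly arbitrary round.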