
Am I misusing predict.rq in the quantreg package?


I am using the quantreg package to predict on new data from a model fit to a training set. However, I noticed a discrepancy between the output of predict.rq (or predict) and the result of computing the predictions manually. Here is an example:

The quantile regression setting is

library(quantreg)

N <- 10000
tauList <- (1:11) / 12   # quantile levels 1/12, 2/12, ..., 11/12
y <- rchisq(N, 2)
X <- matrix(rnorm(3 * N), nrow = N, ncol = 3)
fit <- rq(y ~ X - 1, tau = tauList, method = "fn")   # no intercept

The new data set I want to predict is

newdata <- matrix(rbeta(3 * N, 2, 2), nrow = N, ncol = 3)

I use predict.rq or predict to predict newdata. Both return the same result:

fit_use_predict <- predict.rq( fit, newdata = as.data.frame(newdata) )

I also compute the prediction manually from the coefficient matrix:

coef_mat <- coef(fit)
fit_use_multiplication <- newdata %*% coef_mat

I expected the two results to be numerically identical, but they are not:

diff <- fit_use_predict - fit_use_multiplication
print(diff)

The difference is not negligible.

However, when predicting on the original data set X, both approaches return the same result, i.e.,

all.equal(predict(fit, newdata = data.frame(X)), X %*% coef_mat, check.attributes = FALSE)  ## TRUE

Am I missing something when using the function? Thanks!


Solution

  • A more serious problem here, before we get to prediction, is that the model forces all of the fitted quantile functions through the origin of the design space, and since the covariates are centered at the origin, all of the quantile functions are forced to cross there. Even if the X's all lay in the positive orthant, it would be quite a strong assumption to say that the distribution of the response is degenerate at the origin.
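
The point above can be checked numerically. The sketch below is not part of the original answer; it reuses the question's setup with a smaller N and a fixed seed (both are assumptions made here for speed and reproducibility). It also illustrates a likely cause of the discrepancy in the question, assuming predict builds its design matrix via model.frame as lm does: as.data.frame(newdata) has columns V1, V2, V3 and none named X, so the formula y ~ X - 1 appears to pick up the training matrix X from the workspace. The column names x1, x2, x3 are hypothetical.

```r
library(quantreg)

set.seed(1)                       # assumption: seed added for reproducibility
N <- 1000                         # assumption: smaller than the question's N
tauList <- (1:11) / 12
y <- rchisq(N, 2)
X <- matrix(rnorm(3 * N), nrow = N, ncol = 3)
newdata <- matrix(rbeta(3 * N, 2, 2), nrow = N, ncol = 3)

# 1. Without an intercept, every fitted quantile equals 0 at x = 0,
#    so all of the quantile functions cross at the origin.
fit0 <- rq(y ~ X - 1, tau = tauList, method = "fn")
at_origin <- matrix(0, nrow = 1, ncol = 3) %*% coef(fit0)

# 2. The data frame below has no column called X, so the prediction is
#    compared against the fitted values on the training matrix X.
pred_bad <- predict(fit0, newdata = as.data.frame(newdata))
gap_to_training <- max(abs(pred_bad - X %*% coef(fit0)))

# 3. Fitting with an intercept on a data frame with named columns
#    (x1, x2, x3 are illustrative names) lets predict() match newdata.
train <- data.frame(y = y, x1 = X[, 1], x2 = X[, 2], x3 = X[, 3])
fit1 <- rq(y ~ x1 + x2 + x3, tau = tauList, method = "fn", data = train)
newdf <- data.frame(x1 = newdata[, 1], x2 = newdata[, 2], x3 = newdata[, 3])
pred_ok <- predict(fit1, newdata = newdf)

# Manual prediction: prepend a column of ones for the intercept.
manual <- cbind(1, newdata) %*% coef(fit1)
gap_to_manual <- max(abs(pred_ok - manual))
```

If gap_to_training comes out at essentially zero, predict was returning fitted values for the training X rather than for newdata, which would also explain why the check with data.frame(X) in the question agreed. With named columns and an intercept, gap_to_manual should likewise be essentially zero.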