
How can I use an lmRob model with factors to predict a new value?


I fit a multivariate model with lmRob from the robust package and I like the fit. How can I use the fit to make a prediction at a given point? The hackish solution is to plot it and place horizontal and vertical lines on the plot to pinpoint the value.

How can I feed the model a point, and have it spit back the prediction? I'm imagining it's something like:

predict(model, newdata = data.frame(x = 2, y = 90))

But this gives me the error:

predict(model, newdata = data.frame(x = 2, y = 90))
Error in `contrasts<-`(`*tmp*`, value = contrasts.arg[[nn]]) : 
  contrasts apply only to factors

The traceback() is:

> traceback()
7: stop("contrasts apply only to factors")
6: `contrasts<-`(`*tmp*`, value = contrasts.arg[[nn]])
5: model.matrix.default(delete.response(Terms), newdata, contrasts = object$contrasts, 
       xlevels = attr(object, "xlevels"))
4: model.matrix(delete.response(Terms), newdata, contrasts = object$contrasts, 
       xlevels = attr(object, "xlevels"))
3: predict.lmRob(model, newdata = data.frame(x = 1, 
       y = 90), interval = "predict")
2: predict(model, newdata = data.frame(x = 1, y = 90), 
       interval = "predict")
1: predict(model, newdata = data.frame(x = 1, y = 90), 
       interval = "predict")

If I just try passing the original data set into predict, I get:

Error in x %*% coefs : non-conformable arguments

Adding the appropriate factor levels fixes the first error, but leaves the second.


Solution

  • You need to make sure that every factor column in newdata has the same levels as in the original data, i.e.

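    library(robust)  # provides lmRob()
    
    dat <- data.frame(x = 1:10, y = factor(sample(letters[1:2], 10, replace = TRUE)),
                      z = runif(10))
    fit <- lmRob(z ~ ., data = dat)
    
    ## Fails: y does not carry the factor levels used in the fit
    predict(fit, newdata = data.frame(x = 11, y = "a"))
    
    ## Works: y is a factor with the same levels as dat$y
    predict(fit, newdata = data.frame(x = 11, y = factor("a", levels = letters[1:2])))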
    

    Edit

    You will get the second error (non-conformable arguments) if a factor in the data has an unused level, for example:

    dat <- data.frame(x = 1:10,
                      y = factor(sample(letters[1:2], 10, replace = TRUE), levels = letters[1:3]),
                      z = runif(10))  # y has an empty "c" level
    fit <- lmRob(z ~ ., data = dat)
    
    ## Fails: y still carries the unused "c" level
    predict(fit, newdata = dat)
    
    ## Works: droplevels() removes the unused level
    predict(fit, newdata = droplevels(dat))
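
Both fixes come down to giving predict() factor columns whose levels match those actually observed in the training data. As a rough sketch, that step could be wrapped in a small helper. The name align_levels() and the example data are made up for illustration and are not part of the robust package; the helper uses base R only, and the lmRob calls are left as comments because they assume the robust package is installed.

```r
# Hypothetical helper (not part of the robust package): coerce the factor
# columns of `newdata` to the levels actually observed in `train`.
align_levels <- function(newdata, train) {
  for (col in intersect(names(newdata), names(train))) {
    if (is.factor(train[[col]])) {
      # droplevels() discards levels that never occur in the training data
      lv <- levels(droplevels(train[[col]]))
      # values not seen in training become NA rather than a new level
      newdata[[col]] <- factor(as.character(newdata[[col]]), levels = lv)
    }
  }
  newdata
}

# Illustrative training data; y carries an unused "c" level
train <- data.frame(
  x = 1:6,
  y = factor(c("a", "b", "a", "b", "a", "b"), levels = letters[1:3]),
  z = c(0.2, 0.5, 0.1, 0.9, 0.4, 0.7)
)

new_point <- align_levels(data.frame(x = 11, y = "a"), train)
levels(new_point$y)

# Assumed usage with the robust package installed:
# fit <- robust::lmRob(z ~ ., data = droplevels(train))
# predict(fit, newdata = new_point)
```

A value in newdata that was never seen in the training data becomes NA here, which seems safer than silently inventing a level the model has no coefficient for.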