I am trying to use predict.boosting from the adabag package on new data. I can't find a way to use it (or any other function from that package) on data without labels.
I am trying:
pr <- predict.boosting(modelfit, test[,2:ncol(test)])
It gives:
Error in `[.data.frame`(newdata, , as.character(object$formula[[2]])) :
undefined columns selected
However, if I include labels:
pr <- predict.boosting(modelfit, test)
it works just fine. But there has to be a way to use it as a predictive model for data without labels.
Thanks for any help!
EDIT: Example from the package:
library(adabag)   # provides predict.boosting
library(rusboost)
library(rpart)
data(iris)
# make it an unbalanced dataset by removing most of the setosa observations
df <- iris[41:150,]
# create binary variable
df$Setosa <- factor(ifelse(df$Species == "setosa", "setosa", "notsetosa"))
# create index of negative examples
idx <- df$Setosa == "notsetosa"
# run model
test.rusboost <- rusb(Setosa ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
                      data = df, boot = FALSE, iters = 20, sampleFraction = .1, idx = idx)
# works: the label column Setosa is present
predict.boosting(test.rusboost, df)
# fails with "undefined columns selected": only the four predictors are passed
predict.boosting(test.rusboost, df[,1:4])
Make sure that every column in train
(the set you used to train the model) is also present in test,
and with the same name.
Please check:
all(colnames(train) %in% colnames(test))
If it's FALSE, you will need to check how you built train and test.
If it's TRUE, and in general, please provide a reproducible example.
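When the check returns FALSE, setdiff shows exactly which columns are missing. The sketch below uses small made-up data frames (train, test and their columns are stand-ins, not the asker's data). It also shows one possible workaround for unlabeled data, which is to add a placeholder column named after the response. This rests on an assumption taken from the error message, namely that the predict function looks the response column up by name in newdata. It has not been checked against any particular adabag version, and any confusion matrix or error rate computed against the placeholder would be meaningless.

```r
# Toy stand-ins for the real data; all names here are illustrative only.
train <- data.frame(
  Sepal.Length = c(5.1, 6.3, 5.8),
  Petal.Length = c(1.4, 4.9, 5.1),
  Setosa       = factor(c("setosa", "notsetosa", "notsetosa"))
)
test <- train[, c("Sepal.Length", "Petal.Length")]  # labels dropped

# Which training columns are absent from the new data?
missing_cols <- setdiff(colnames(train), colnames(test))
print(missing_cols)

# Assumed workaround: add an all-NA placeholder for the missing response,
# keeping the training levels so the factor has the same structure.
test$Setosa <- factor(rep(NA_character_, nrow(test)),
                      levels = levels(train$Setosa))

# The check from above now passes.
print(all(colnames(train) %in% colnames(test)))
```

With the placeholder in place, only the predicted classes would be worth reading from the result.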
Edit:
A nice way to check that the columns are the same, and that they contain the same factor levels, is to use sameShape
from the dataPreparation package. If that's not the case, it will add levels and columns (and warn you).
To use it:
library(dataPreparation)
test <- sameShape(test, train)
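To make the idea concrete without the package, here is a rough base R sketch of aligning a new data frame to a reference one. The helper align_to is hypothetical and is not how dataPreparation implements sameShape. In particular, it recodes factors to the reference levels, so values unseen in the reference become NA.

```r
# Hypothetical helper (not part of dataPreparation): make `new` match `ref`.
align_to <- function(new, ref) {
  for (col in colnames(ref)) {
    # Add any column that exists in the reference but not in the new data.
    if (!col %in% colnames(new)) {
      new[[col]] <- NA
      warning("added missing column: ", col)
    }
    # Recode factors to the reference levels; unseen values become NA.
    if (is.factor(ref[[col]])) {
      new[[col]] <- factor(as.character(new[[col]]),
                           levels = levels(ref[[col]]))
    }
  }
  # Return the columns in the same order as the reference.
  new[, colnames(ref), drop = FALSE]
}

# Toy data: `test` lacks column x and one factor level of g.
train <- data.frame(x = c(1, 2, 3), g = factor(c("a", "b", "c")))
test  <- data.frame(g = factor(c("a", "b")))

aligned <- suppressWarnings(align_to(test, train))
str(aligned)
```

Filling a missing predictor with NA only makes the shapes agree. Whether the model can predict from such rows depends on how it handles missing values.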