I am trying to use the mlr package in R to apply feature selection to a bagged learner, using sequential forward search (SFS).
library(mlr)

d <- data.frame(a = rnorm(1000, mean = 1),
                b = rnorm(1000, mean = 2),
                c = rnorm(1000, mean = 3),
                target = as.factor(rbinom(1000, 1, prob = 0.5)))
# Named classif.task so that it matches the train() calls below
classif.task <- makeClassifTask(data = d,
                                target = 'target',
                                positive = '1')
logreg.lrn <- makeLearner('classif.logreg')
logreg_bagged.lrn <- makeBaggingWrapper(logreg.lrn)
cntrl.sfs <- makeFeatSelControlSequential(method = "sfs",
                                          alpha = 0.01,
                                          max.features = 10,
                                          maxit = 3)
logreg_bagged_featsel.lrn <- makeFeatSelWrapper(logreg_bagged.lrn,
                                                resampling = makeResampleDesc('CV',
                                                                              iters = 3),
                                                measures = mmce,
                                                control = cntrl.sfs)
mlr::train(logreg_bagged_featsel.lrn, classif.task)
This returns the following error:
[FeatSel] Started selecting features for learner 'classif.logreg.bagged'
With control class: FeatSelControlSequential
Imputation value: 1
[FeatSel-x] 1: 000 (0 bits)
Error in mlr::train(logreg_bagged_featsel.lrn, classif.task) :
Assertion on '.newdata' failed: Must have at least 1 cols, but has 0 cols.
When I use a sequential backward search instead, the error does not occur:
cntrl.sbs <- makeFeatSelControlSequential(method = "sbs",
                                          alpha = 0.01,
                                          max.features = 10,
                                          maxit = 3)
logreg_bagged_featsel.lrn <- makeFeatSelWrapper(logreg_bagged.lrn,
                                                resampling = makeResampleDesc('CV',
                                                                              iters = 3),
                                                measures = mmce,
                                                control = cntrl.sbs)
mlr::train(logreg_bagged_featsel.lrn, classif.task)
[FeatSel] Started selecting features for learner 'classif.logreg.bagged'
With control class: FeatSelControlSequential
Imputation value: 1
[FeatSel-x] 1: 111 (3 bits)
[FeatSel-y] 1: mmce.test.mean=0.447; time: 0.0 min
[FeatSel-x] 2: 011 (2 bits)
[FeatSel-y] 2: mmce.test.mean=0.509; time: 0.0 min
[FeatSel-x] 2: 101 (2 bits)
[FeatSel-y] 2: mmce.test.mean=0.448; time: 0.0 min
[FeatSel-x] 2: 110 (2 bits)
[FeatSel-y] 2: mmce.test.mean=0.456; time: 0.0 min
[FeatSel-x] 3: 001 (1 bits)
[FeatSel-y] 3: mmce.test.mean=0.51; time: 0.0 min
[FeatSel-x] 3: 100 (1 bits)
[FeatSel-y] 3: mmce.test.mean=0.468; time: 0.0 min
[FeatSel] Result: ac (2 bits)
Model for learner.id=classif.logreg.bagged.featsel; learner.class=FeatSelWrapper
Trained on: task.id = classif.df; obs = 1000; features = 3
Hyperparameters: model=FALSE
How can I make this work for sequential forward search? Thanks.
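Sequential forward search starts with an empty model, i.e. no features (the `000 (0 bits)` line in your log). The bagging wrapper doesn't support this, which is why the assertion on '.newdata' complains about 0 columns. Sequential backward search starts from the full feature set, so it never hits that case. I've opened an issue for this here.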
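To make the difference between the two starting points concrete, here is a minimal base-R sketch. It is only an illustration, not mlr's internal code: the helpers `as_bits`, `forward_candidates` and `backward_candidates` are hypothetical names I made up, and the small data frame is a toy stand-in for your data. It reproduces the bit strings from the `[FeatSel-x]` log lines and shows that the forward start corresponds to a data frame with zero columns.

```r
# Illustrative sketch only (base R); these helpers are NOT part of mlr.
features <- c("a", "b", "c")

# Encode a feature subset as a bit string, like the "[FeatSel-x]" log lines.
as_bits <- function(selected, all = features) {
  paste(as.integer(all %in% selected), collapse = "")
}

# One forward step: add each not-yet-selected feature in turn.
forward_candidates <- function(selected, all = features) {
  lapply(setdiff(all, selected), function(f) c(selected, f))
}

# One backward step: drop each currently selected feature in turn.
backward_candidates <- function(selected) {
  lapply(selected, function(f) setdiff(selected, f))
}

sfs_start <- character(0)  # forward search starts from the empty set
sbs_start <- features      # backward search starts from the full set

as_bits(sfs_start)  # "000"
as_bits(sbs_start)  # "111"

vapply(forward_candidates(sfs_start), as_bits, character(1))
# "100" "010" "001"
vapply(backward_candidates(sbs_start), as_bits, character(1))
# "011" "101" "110"

# Toy data: subsetting to the empty starting set leaves zero columns,
# which is the situation the '.newdata' assertion rejects.
toy <- data.frame(a = 1:3, b = 4:6, c = 7:9)
ncol(toy[, sfs_start, drop = FALSE])  # 0
ncol(toy[, sbs_start, drop = FALSE])  # 3
```

So the very first set that forward search evaluates is the one with no columns, whereas backward search only ever shrinks from the full set and, in your log, stops before reaching the empty one.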