I am trying to use the mlr package to tune the hyper-parameters of a decision tree built with the rpart package. Although I can tune the basic parameters of the decision tree (e.g. minsplit, maxdepth and so on), I am not able to properly set the values of the parameter parms. Specifically, I would like to try different priors in the grid search.
Here is the code I have written (dat is the data frame I am using, and target is my class variable):
# Create a task
dat.task = makeClassifTask(id = "tree", data = dat, target = "target")
# Define the resampling strategy
resamp = makeResampleDesc("CV", iters = 4L)
# Create the learner
lrn = makeLearner("classif.rpart")
# Create the grid params
control.grid = makeTuneControlGrid()
ps = makeParamSet(
  makeDiscreteParam("cp", values = seq(0.001, 0.006, 0.002)),
  makeDiscreteParam("minsplit", values = c(1, 5, 10, 50)),
  makeDiscreteParam("maxdepth", values = c(20, 30, 50)),
  makeDiscreteParam("parms", values = list(prior = list(c(.6, .4),
                                                        c(.5, .5))))
)
When I try to execute the tuning, with:
# Actual tuning, with accuracy as evaluation metric
tuned = tuneParams(lrn, task = dat.task,
                   resampling = resamp,
                   control = control.grid,
                   par.set = ps, measures = acc)
I get the error
Error in get(paste("rpart", method, sep = "."), envir = environment())(Y, : The parms list must have names
I also tried to define parms as an UntypedParam with
makeUntypedParam("parms", special.vals = list(prior=list(c(.6, .4), c(.5,.5))))
I did this because the output of getParamSet("classif.rpart") suggests that parms is an untyped parameter rather than a discrete one.
However, when I try this, I get the error:
Error in makeOptPath(par.set, y.names, minimize, add.transformed.x, include.error.message, :
OptPath can currently only be used for: numeric,integer,numericvector,integervector,logical,logicalvector,discrete,discretevector,character,charactervector
Can anybody help?
You have to define the parameter "parms" like this:
makeDiscreteParam("parms", values = list(a = list(prior = c(.6, .4)), b = list(prior = c(.5, .5))))
a and b can be arbitrary names; they just describe what the actual value is.
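As far as I can tell, the grid in the question fails because its discrete parameter has a single value, named prior, whose content is an unnamed list of two vectors. That unnamed list is what reaches rpart as parms, hence "The parms list must have names". Here is a minimal sketch of the whole tuning run with the corrected definition. It assumes the two-class sonar.task that ships with mlr as a stand-in for your dat / target, and the labels prior_60_40 and prior_50_50 are names I made up for illustration:

```r
library(mlr)

# Assumption: sonar.task (a two-class example task bundled with mlr)
# stands in for the asker's own data frame and target column.
task <- sonar.task

resamp <- makeResampleDesc("CV", iters = 4L)
lrn <- makeLearner("classif.rpart")

# Each discrete value is itself a complete, named `parms` list for rpart.
# The outer names (hypothetical labels) only identify the grid points.
ps <- makeParamSet(
  makeDiscreteParam("cp", values = c(0.001, 0.003, 0.005)),
  makeDiscreteParam("parms", values = list(
    prior_60_40 = list(prior = c(0.6, 0.4)),
    prior_50_50 = list(prior = c(0.5, 0.5))
  ))
)

set.seed(1)
tuned <- tuneParams(lrn, task = task, resampling = resamp,
                    control = makeTuneControlGrid(),
                    par.set = ps, measures = acc, show.info = FALSE)

tuned$x  # selected hyper-parameter values
tuned$y  # cross-validated accuracy of the selected setting
```

Note that each prior vector needs one entry per class and has to sum to 1, so for a problem with more than two classes the vectors must be extended accordingly.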