
Tuning parms in rpart with MLR package?


I am trying to use the MLR package to tune the hyper-parameters of a decision tree built with the rpart package. I can tune the basic parameters of the decision tree (e.g. minsplit, maxdepth and so on), but I am not able to properly set the values of the parameter parms. Specifically, I would like to try different priors in the grid search.

Here is the code I have written (dat is the data frame I am using, and target is my class variable):

library(mlr)

# Create a task
dat.task = makeClassifTask(id = "tree", data = dat, target = "target")
# Define the resampling strategy (4-fold cross-validation)
resamp = makeResampleDesc("CV", iters = 4L)
# Create the learner
lrn = makeLearner("classif.rpart")
# Create the grid params
control.grid = makeTuneControlGrid() 
ps = makeParamSet(
     makeDiscreteParam("cp", values = seq(0.001, 0.006, 0.002)),
     makeDiscreteParam("minsplit", values = c(1, 5, 10, 50)),
     makeDiscreteParam("maxdepth", values = c(20, 30, 50)),
     makeDiscreteParam("parms", values = list(prior=list(c(.6, .4), 
                                                         c(.5, .5))))
)

When I try to execute the tuning, with:

# Actual tuning, with accuracy as evaluation metric
tuned = tuneParams(lrn, task = dat.task, 
                   resampling = resamp, 
                   control = control.grid, 
                   par.set = ps, measures = acc)

I get the error:

Error in get(paste("rpart", method, sep = "."), envir = environment())(Y, : The parms list must have names

I also tried to define parms as an UntypedParam with

makeUntypedParam("parms", special.vals = list(prior=list(c(.6, .4), c(.5,.5))))

I tried this because, judging from the output of getParamSet("classif.rpart"), parms appears to be an "untyped" parameter rather than a discrete one.

However, when I try this, I get the error:

Error in makeOptPath(par.set, y.names, minimize, add.transformed.x, include.error.message,  : 
  OptPath can currently only be used for: numeric,integer,numericvector,integervector,logical,logicalvector,discrete,discretevector,character,charactervector

Can anybody help?


Solution

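  • Each element of values must itself be a complete, named parms list, so define the parameter "parms" like this:

    makeDiscreteParam("parms", values = list(a = list(prior = c(.6, .4)), b = list(prior = c(.5, .5))))
    

    The names a and b are arbitrary labels for the two candidate settings; pick names that describe the value each one stands for. In your version, values contained a single candidate named prior whose value was the unnamed list list(c(.6, .4), c(.5, .5)). That unnamed list was passed to rpart as parms, which is why you got "The parms list must have names".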
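
    For reference, here is a minimal sketch of the whole grid search with the corrected parameter. It is only an illustration under some assumptions: the data is a two-class subset of iris (not your dat), the target column is therefore Species, the labels prior_60_40 and prior_50_50 are made-up names, and the grid is smaller than yours to keep the run short.

```r
library(mlr)

# Illustrative two-class data (an assumption; substitute your own `dat`)
dat <- droplevels(iris[iris$Species != "setosa", ])

dat.task <- makeClassifTask(id = "tree", data = dat, target = "Species")
resamp <- makeResampleDesc("CV", iters = 4L)
lrn <- makeLearner("classif.rpart")
control.grid <- makeTuneControlGrid()

ps <- makeParamSet(
  makeDiscreteParam("cp", values = c(0.001, 0.003, 0.005)),
  makeDiscreteParam("minsplit", values = c(5, 10)),
  # Each candidate is a complete, named `parms` list;
  # the outer names are just labels shown in the tuning log.
  makeDiscreteParam("parms", values = list(
    prior_60_40 = list(prior = c(.6, .4)),
    prior_50_50 = list(prior = c(.5, .5))
  ))
)

set.seed(1)  # make the CV splits reproducible
tuned <- tuneParams(lrn, task = dat.task, resampling = resamp,
                    control = control.grid, par.set = ps, measures = acc)

# The selected `parms` value is the inner list, e.g. list(prior = ...)
tuned$x$parms
```

    as.data.frame(tuned$opt.path) should list every grid point, with the parms column showing the labels rather than the lists themselves.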