I'm building a decision tree with rpart via the caret::train function. I want to set rpart's minsplit parameter to 1 so that I can prune the tree afterwards with cp. What I get from here is that such parameters should be passed through the ... of the train function, but this doesn't work. A minimal reproducible example:
library(caret)
library(rpart)
mod1 <- train(Species ~ ., data = iris, method = "rpart", tuneGrid = expand.grid(cp = 0), minsplit = 1)
mod2 <- rpart(Species ~ ., data = iris, cp = 0, minsplit = 1)
What I get is that mod1$finalModel and mod2 are quite different. I would like mod1$finalModel to be like mod2 (i.e., totally overfitted). I cannot pass the parameter through tuneGrid either, since it only accepts a cp column.
So my question is: is there any way in caret to pass the argument minsplit = 1 to the train function and then cross-validate over the cp parameter?
Ok, thanks to this post I figured out how to do it:
mod1 <- train(Species ~ ., iris, method = "rpart",
control = rpart.control(minsplit = 1, minbucket = 1))
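To cover the cross-validation part of the question, here is a rough sketch of the same call with an explicit cp grid. The grid values, the 5-fold resampling setup and the seed are my own illustrative choices, not something from the linked post:

```r
library(caret)
library(rpart)

set.seed(42)  # arbitrary seed, only so the resampling folds are repeatable

# Candidate cp values to cross-validate over (illustrative values).
cp_grid <- expand.grid(cp = c(0, 0.01, 0.05, 0.1))

mod_cv <- train(
  Species ~ ., data = iris, method = "rpart",
  trControl = trainControl(method = "cv", number = 5),
  tuneGrid  = cp_grid,
  # Growth settings go through rpart's control object, as in the fix above.
  control   = rpart.control(minsplit = 1, minbucket = 1)
)

mod_cv$bestTune                     # cp value picked by cross-validation
mod_cv$finalModel$control$minsplit  # minsplit stored in the final rpart fit
```

If the control object is honoured, the last line should report the minsplit that was requested rather than rpart's default.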
I'm still not quite sure why the arguments have to be passed via control = rpart.control(). Passing minsplit = 1, minbucket = 1 directly to the train function simply doesn't work.
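A likely explanation, which I have not confirmed against caret's source: caret's rpart wrapper seems to build its own control object (it has to set cp itself) and pass it to rpart together with whatever arrives through the dots. In rpart, an explicit control argument takes precedence over control parameters given through the dots, so a bare minsplit would be silently overwritten. The rpart side of this can be checked without caret:

```r
library(rpart)

# minsplit given through ... only: rpart builds its control list from it.
fit_dots <- rpart(Species ~ ., data = iris, cp = 0, minsplit = 1)

# minsplit given through ... together with an explicit control object.
# Assumption: this mimics what caret ends up calling internally.
fit_both <- rpart(Species ~ ., data = iris, minsplit = 1,
                  control = rpart.control(cp = 0))

fit_dots$control$minsplit  # minsplit actually used by the first fit
fit_both$control$minsplit  # minsplit actually used by the second fit
```

In the second fit the control object carries rpart.control's default minsplit, and that default replaces the value passed directly. This would explain why the settings only survive when they are placed inside control = rpart.control(...) in the train call.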