Plotting Partial Dependence Plots in R for binary target (mlr)

I am having trouble getting partial dependence plots with mlr to work properly. For some reason the class label is plotted rather than the probability. I suspect that the target may get lost when the partial dependence data are created.

Any ideas?


library(dplyr)
library(mlr)

# select subset
iris_bin <- iris %>% 
  filter(Species != "virginica") %>% 
  mutate(bin_target = ifelse(Species == "setosa", TRUE, FALSE))

# fit model
task_bin <- makeClassifTask(data = iris_bin, target = "bin_target")
lrn_bin  <- makeLearner("classif.ranger", predict.type = "prob")
fit_bin <- train(lrn_bin, task_bin)

# create partial dependence plot
pd <- generatePartialDependenceData(fit_bin, task_bin, "Sepal.Length")

pd  # is the target correct?
#> PartialDependenceData
#> Task: iris_bin
#> Features: Sepal.Length
#> Target: FALSE
#> Derivative: FALSE
#> Interaction: FALSE
#> Individual: FALSE
#>        FALSE Sepal.Length
#> 1: 0.4920347          4.3
#> 2: 0.4920347          4.6
#> 3: 0.4935947          4.9
#> 4: 0.4945947          5.2
#> 5: 0.5104600          5.5
#> 6: 0.5107800          5.8
#> ... (#rows: 10, #cols: 2)


Here are the details of my current session, in case that helps:

  • Hopefully the mlr package maintainers can help (I don't use that package). However, in the meantime, you can fit the model directly, and just use the pdp package:

    library(ranger)
    library(pdp)

    fit <- ranger(as.factor(bin_target) ~ ., data = iris_bin, 
                  probability = TRUE)
    pd <- partial(fit, pred.var = "Sepal.Length", prob = TRUE)

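    Note the use of prob = TRUE in the call to partial, which returns the partial dependence on the probability scale rather than the default centered logit scale. Also, ggplot2 is not necessary, as you can just use plotPartial(pd) instead, which relies on lattice graphics.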
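
    The two plotting routes can be sketched as follows. This is a minimal, self-contained sketch under a few assumptions of mine, not the only way to do it: the data are rebuilt with base R, Species is dropped because it determines the target, and which.class = 2 is passed so that the curve refers to the TRUE class. As far as I know, partial() uses the first factor level as the focus class by default, which here would be FALSE.

    ```r
    library(ranger)
    library(pdp)
    library(ggplot2)  # only needed for autoplot()

    # setosa vs. versicolor, with a factor target (levels: "FALSE", "TRUE")
    iris_bin <- droplevels(subset(iris, Species != "virginica"))
    iris_bin$bin_target <- factor(iris_bin$Species == "setosa")
    iris_bin$Species <- NULL  # assumption: drop the column that defines the target

    set.seed(1)
    fit <- ranger(bin_target ~ ., data = iris_bin, probability = TRUE)

    # prob = TRUE: probability scale; which.class = 2: second level, i.e. "TRUE"
    pd <- partial(fit, pred.var = "Sepal.Length", prob = TRUE,
                  which.class = 2, train = iris_bin)

    plotPartial(pd)  # lattice graphics
    autoplot(pd)     # ggplot2 alternative
    ```

    Both calls plot the same pd object; only the graphics system differs.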

    Also, you can still fit the model with mlr and then use partial on that; for instance,

    library(dplyr)
    library(mlr)
    library(pdp)
    # select subset
    iris_bin <- iris %>% 
      filter(Species != "virginica") %>% 
      mutate(bin_target = ifelse(Species == "setosa", TRUE, FALSE))
    # fit model
    task_bin <- makeClassifTask(data = iris_bin, target = "bin_target")
    lrn_bin  <- makeLearner("classif.ranger", predict.type = "prob")
    fit_bin <- train(lrn_bin, task_bin)
    # partial dependence plot
    mod <- getLearnerModel(fit_bin)  # EXTRACT THE MODEL!!  <<--
    partial(mod, pred.var = "Sepal.Length", prob = TRUE, 
            plot = TRUE, train = iris_bin)

    Note, however, the need to supply the original training data via the train argument.
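
    As a sanity check on what any of these tools plot, partial dependence for a single feature can also be computed by hand: fix the feature at each grid value for every row, predict, and average the predicted probabilities. A minimal sketch, assuming a ranger probability forest; the grid size of 10 and the names grid and pd_manual are my own choices, and Species is dropped here because it determines the target.

    ```r
    library(ranger)

    # setosa vs. versicolor, with a factor target (levels: "FALSE", "TRUE")
    iris_bin <- subset(iris, Species != "virginica")
    iris_bin$bin_target <- factor(iris_bin$Species == "setosa")
    iris_bin$Species <- NULL  # assumption: drop the column that defines the target

    set.seed(1)
    fit <- ranger(bin_target ~ ., data = iris_bin, probability = TRUE)

    # evenly spaced grid over the observed range of the feature
    grid <- seq(min(iris_bin$Sepal.Length), max(iris_bin$Sepal.Length),
                length.out = 10)

    # for each grid value: overwrite the feature in every row,
    # predict class probabilities, and average the "TRUE" column
    pd_manual <- sapply(grid, function(v) {
      tmp <- iris_bin
      tmp$Sepal.Length <- v
      mean(predict(fit, data = tmp)$predictions[, "TRUE"])
    })

    data.frame(Sepal.Length = grid, prob_TRUE = pd_manual)
    ```

    Averaging the "FALSE" column instead gives the complementary curve, since the two class probabilities sum to one for every row.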