r shiny mlr3

Accessing probabilities for PipeOps TwoClass Classif Learners


I am currently working on mlr3shiny, a program that uses various mlr3 methods in an R Shiny UI to build mlr3 models. I am unable to integrate the learner objects with DALEX for model analysis. The problem arises only for twoclass models, as the pipeline turns the learners into predict_type = "response". I cannot change this predict_type once the learner has been created; if I try, I get the following error:

PipeOpError: $predict_type for PipeOpThreshold is read-only.

Because of these response-type learners, the learner does not output the probabilities I need for further analysis with DALEX.

This is the function used to initialize the learner. For twoclass tasks, the resulting graph learner reports predict_type = 'response', as this is the default:

createGraphLearner <- function(selectedlearner) {
  # %in% is used because task$properties can hold more than one entry
  # (e.g. "twoclass" plus "weights"), in which case isTRUE(x == "twoclass") is FALSE
  is_twoclass <- "twoclass" %in% currenttask$task$properties
  if (!is_twoclass) {
    learner <- lrn(input[[selectedlearner]])
  } else { # ...otherwise predict_type = "prob" is set and a threshold po added below
    learner <- lrn(input[[selectedlearner]], predict_type = "prob")
  }
  # optionally prepend the robustify preprocessing pipeline
  if (input[["Task_robustify"]]) {
    graph <- pipeline_robustify(currenttask$task, learner) %>>% learner
  } else {
    graph <- as_graph(po("learner", learner))
  }
  plot(graph)
  if (is_twoclass) graph <- graph %>>% po("threshold")

  return(as_learner(graph))
}

These learners can be worked with in various ways, but I have not found a way to access the probabilities of the model once it has been trained on the final dataset.


Solution

  • As far as I understand your problem, you want to get the probability predictions of a GraphLearner whose final PipeOp is a PipeOpThreshold?

    The answer might be unexpected. Even though PipeOpThreshold claims to have the predict_type "response", it actually outputs probabilities as well (this is a bug, which I have reported).

    Fortunately for you this means that you don't have to change the learner's predict type, but still have access to the probabilities :)

    Btw: Awesome that you are working on mlr3shiny! :) If you wanna get in touch, you can join our Mattermost channel, which you can find on our website or in the README of mlr-org on GitHub.

    library(mlr3verse)
    #> Loading required package: mlr3
    
    l = lrn("classif.rpart", predict_type = "prob")
    task = tsk("sonar")
    
    glrn = as_learner(
      po("learner", l) %>>%
        po("threshold"))
    
    glrn$train(task)
    glrn$predict_type
    #> [1] "response"
    
    p = glrn$predict(task)
    
    p
    #> <PredictionClassif> for 208 observations:
    #>     row_ids truth response    prob.M    prob.R
    #>           1     R        R 0.1060606 0.8939394
    #>           2     R        M 0.7333333 0.2666667
    #>           3     R        R 0.0000000 1.0000000
    #> ---                                           
    #>         206     M        M 0.9250000 0.0750000
    #>         207     M        M 0.9250000 0.0750000
    #>         208     M        M 0.9250000 0.0750000
    

    Created on 2023-02-28 with reprex v2.0.2
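
To connect this to the DALEX part of the question, here is a minimal sketch of one way the probabilities could be pulled out of the prediction and passed on. It assumes that the behaviour shown above (the prediction carries a `prob` matrix even though `predict_type` reports "response") holds in your installed mlr3pipelines version, and that DALEX is installed. `predict_prob` is a hypothetical helper name, not part of mlr3 or DALEX, and the label is arbitrary.

```r
library(mlr3verse)

task = tsk("sonar")
glrn = as_learner(
  po("learner", lrn("classif.rpart", predict_type = "prob")) %>>%
    po("threshold"))
glrn$train(task)

# Feature data and a 0/1 target (1 = positive class) for the explainer
features = as.data.frame(task$data(cols = task$feature_names))
y = as.numeric(task$truth() == task$positive)

# Hypothetical helper: returns the probability of the positive class.
# It reads the $prob matrix of the prediction, whose columns are named
# after the class levels.
predict_prob = function(model, newdata) {
  model$predict_newdata(newdata)$prob[, task$positive]
}

probs = predict_prob(glrn, features)
head(probs)

# Pass the helper to DALEX as a custom predict function
explainer = DALEX::explain(
  model = glrn,
  data = features,
  y = y,
  predict_function = predict_prob,
  label = "rpart + threshold",
  verbose = FALSE)
```

Using a custom `predict_function` sidesteps the read-only `predict_type` entirely, since the explainer never asks the learner what it claims to predict. If a later mlr3pipelines release changes how PipeOpThreshold reports its predict type, the helper should still work as long as the prediction contains a `prob` matrix.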