
Using predict_proba() instead of predict() in a Neuraxle Pipeline with OneVsRestClassifier


I'm trying to set up a Neuraxle Pipeline that uses sklearn's OneVsRestClassifier (OVR).

Every valid step in a Neuraxle pipeline has to implement the fit() and transform() methods.

To use sklearn's estimators as pipeline steps, Neuraxle provides an SKLearnWrapper that maps the OVR's predict() method to the wrapper's transform() method.

  1. Is there a way to modify this behavior so that the OVR's predict_proba() method is mapped to the wrapper's transform() method instead?

  2. Or is there another way of retrieving the calculated probabilities?


Solution

  • Very good question!

    We already have something to solve this.

    Suppose you code a class like this:

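    import numpy as np
    from neuraxle.base import BaseStep

    def sigmoid(x):
        # Logistic function: squashes raw scores into the (0, 1) range.
        return 1 / (1 + np.exp(-x))

    class MyWrapper(BaseStep):

        def transform(self, data_inputs):
            return sigmoid(data_inputs)

        def predict_proba(self, data_inputs):
            return data_inputs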
    

    You could instantiate it as follows:

    step = MyWrapper()
    

    Then, once you're ready to replace the method, use Neuraxle's mutate function:

    step = step.mutate(new_method='predict_proba', method_to_assign_to='transform')
    

    After that, whenever .transform() is called, the predict_proba method is called instead. The mutate call works even if your step is wrapped (nested) deeper within other steps.

    Note that we should probably modify the sklearn wrapper to allow this. I've added the issue here: https://github.com/Neuraxio/Neuraxle/issues/368

    So until this issue is fixed, you could write class MySKLearnWrapper(SKLearnWrapper): ... (inheriting from SKLearnWrapper to modify it) and define predict_proba yourself, as suggested here: https://github.com/Neuraxio/Neuraxle/pull/363/files
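
To make the subclassing idea more concrete, here is a minimal sketch of a wrapper whose transform() delegates to the wrapped estimator's predict_proba() rather than predict(). It is plain Python and deliberately does not inherit from the real SKLearnWrapper: ProbaWrapper and ToyEstimator are hypothetical names invented for this illustration, and the toy estimator only stands in for a classifier such as OneVsRestClassifier that exposes fit(), predict() and predict_proba().

```python
class ToyEstimator:
    """Hypothetical stand-in for an sklearn-style classifier."""

    def fit(self, X, y):
        # Remember the sorted class labels, similar to sklearn's classes_.
        self.classes_ = sorted(set(y))
        return self

    def predict(self, X):
        # Dummy prediction: always return the first class label.
        return [self.classes_[0] for _ in X]

    def predict_proba(self, X):
        # Dummy uniform probabilities: one row per sample, one column per class.
        n_classes = len(self.classes_)
        return [[1.0 / n_classes] * n_classes for _ in X]


class ProbaWrapper:
    """Hypothetical wrapper: transform() returns probabilities, not labels."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, data_inputs, expected_outputs):
        self.estimator.fit(data_inputs, expected_outputs)
        return self

    def transform(self, data_inputs):
        # Delegate to predict_proba() where a label-oriented wrapper would call predict().
        return self.estimator.predict_proba(data_inputs)


step = ProbaWrapper(ToyEstimator()).fit([[0], [1], [2]], ["a", "b", "a"])
probabilities = step.transform([[3], [4]])
```

In a real pipeline the same delegation would live in a subclass of SKLearnWrapper; how that subclass hooks into the wrapper's internals depends on the Neuraxle version, so treat the class layout above as an assumption rather than the library's API.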