Returning standard deviation with `BaggingRegressor`

Is there a way to return standard deviation using sklearn.ensemble.BaggingRegressor?

All the examples I have looked at so far only return the mean prediction.

Solution

You can always get the predictions of each individual estimator in the ensemble; the fitted estimators are accessible through the ensemble's estimators_ attribute. You can then process these predictions however you need (compute the mean, standard deviation, etc.).

Adapting the example from the documentation, with an ensemble of 10 SVR base estimators:

import numpy as np
from sklearn.svm import SVR
from sklearn.ensemble import BaggingRegressor
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=4,
                       n_informative=2, n_targets=1,
                       random_state=0, shuffle=False)
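# The base estimator is passed positionally because its keyword was renamed
# from base_estimator to estimator in scikit-learn 1.2.
regr = BaggingRegressor(SVR(),
                        n_estimators=10, random_state=0).fit(X, y)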


regr.predict([[0, 0, 0, 0]]) # get (mean) prediction for a single sample, [0, 0, 0, 0]
# array([-2.87202411])

# get the predictions from each individual member of the ensemble using a list comprehension:

raw_pred = [x.predict([[0, 0, 0, 0]]) for x in regr.estimators_]
raw_pred
# result:
[array([-2.13003431]),
 array([-1.96224516]),
 array([-1.90429596]),
 array([-6.90647796]),
 array([-6.21360547]),
 array([-1.84318744]),
 array([1.82285686]),
 array([4.62508622]),
 array([-5.60320499]),
 array([-8.60513286])]

# get the mean, and check that it matches the one returned above by the ensemble's .predict method:

np.mean(raw_pred)
# -2.8720241079257436
np.allclose(np.mean(raw_pred), regr.predict([[0, 0, 0, 0]])) # sanity check, tolerant of floating-point rounding
# True

# get the standard deviation (np.std uses ddof=0 by default; pass ddof=1 for the sample estimate):
np.std(raw_pred)
# 3.865135037828279
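
If you need the spread for many samples at once, the same idea can be wrapped in a small function. The following is only a sketch: predict_with_std is a hypothetical helper name, not part of scikit-learn. It also indexes the columns through estimators_features_, which should only matter if the ensemble was built with max_features below 1.0; with the default settings used above, every member sees all the features.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.svm import SVR


def predict_with_std(ensemble, X, ddof=0):
    """Return (mean, std) of the per-estimator predictions for each row of X.

    Hypothetical helper for illustration; not a scikit-learn function.
    """
    X = np.asarray(X)
    # Each member may have been trained on a subset of the columns,
    # so select that member's features before predicting.
    all_preds = np.stack([
        est.predict(X[:, feats])
        for est, feats in zip(ensemble.estimators_,
                              ensemble.estimators_features_)
    ])  # shape: (n_estimators, n_samples)
    return all_preds.mean(axis=0), all_preds.std(axis=0, ddof=ddof)


X, y = make_regression(n_samples=100, n_features=4, n_informative=2,
                       n_targets=1, random_state=0, shuffle=False)
regr = BaggingRegressor(SVR(), n_estimators=10, random_state=0).fit(X, y)

# One mean and one standard deviation per input row
mean, std = predict_with_std(regr, X[:5])
```

Here mean should agree with regr.predict(X[:5]), and std has one entry per row. Pass ddof=1 if you prefer the sample standard deviation across the 10 estimators.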