I'm using scikit-learn's Perceptron algorithm to do binary classification. With some of the other algorithms in the library (RandomForestClassifier, LogisticRegression, etc.), I can use model.predict_proba() to have the algorithm output the probability of getting a positive (1) for each example. Is there a way to get a similar output for the Perceptron algorithm?
The closest I've been able to come is model.decision_function(), which outputs a confidence score for the example based on the signed distance to the hyperplane, but I'm not sure how to convert these confidence scores to the probability figures I want. model.predict() is also only returning binary values.
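I think what you want is CalibratedClassifierCV. It wraps a classifier that only exposes decision_function() and fits a calibration step on top of its scores, so the wrapped model gains predict_proba():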
from sklearn import linear_model
from sklearn.datasets import make_classification
from sklearn.calibration import CalibratedClassifierCV

# Build a classification task using 3 informative features
X, y = make_classification(n_samples=1000,
                           n_features=10,
                           n_informative=3,
                           n_redundant=0,
                           n_repeated=0,
                           n_classes=2,
                           random_state=0,
                           shuffle=False)

# Perceptron has no predict_proba(), so wrap it in a calibrator
per = linear_model.Perceptron()
clf_isotonic = CalibratedClassifierCV(per, cv=10, method='isotonic')
clf_isotonic.fit(X[:900], y[:900])

# One row per held-out example: [P(class 0), P(class 1)]
preds = clf_isotonic.predict_proba(X[900:])
print(preds)
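To give a rough idea of what the calibration step is doing with the decision_function() scores from the question, here is a minimal sketch of sigmoid (Platt-style) calibration done by hand. The three-way split, shuffle=True, and the variable names are my own choices for illustration, not something scikit-learn prescribes. CalibratedClassifierCV does the equivalent with cross-validation, and the example above uses isotonic regression rather than a sigmoid, so the numbers will not match.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, Perceptron

# Same kind of task as above, but shuffled so every slice holds both classes
# (an assumption made for this sketch).
X, y = make_classification(n_samples=1000, n_features=10, n_informative=3,
                           n_redundant=0, n_repeated=0, n_classes=2,
                           random_state=0, shuffle=True)

# Three disjoint slices: train the Perceptron, fit the calibrator, then test.
X_train, y_train = X[:600], y[:600]
X_calib, y_calib = X[600:900], y[600:900]
X_test = X[900:]

per = Perceptron(random_state=0).fit(X_train, y_train)

# Signed distances to the hyperplane, reshaped into a one-column feature matrix.
scores_calib = per.decision_function(X_calib).reshape(-1, 1)
scores_test = per.decision_function(X_test).reshape(-1, 1)

# A one-feature logistic regression learns the mapping score -> P(y = 1).
calibrator = LogisticRegression().fit(scores_calib, y_calib)
manual_proba = calibrator.predict_proba(scores_test)

print(manual_proba[:5])
```

The calibrator is fitted on data the Perceptron did not see during training; fitting it on the training scores would tend to give overconfident probabilities, which is why CalibratedClassifierCV uses cross-validation.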
[Edit] You can also use this to make other linear_model classifiers that lack predict_proba() produce probabilities for classification problems.
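As a sketch of that, the same wrapper can be applied to other linear classifiers that only expose decision_function(). RidgeClassifier and SGDClassifier with hinge loss are just two examples I picked, and cv=5 with method='sigmoid' is an arbitrary choice here; isotonic works the same way.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.linear_model import RidgeClassifier, SGDClassifier

X, y = make_classification(n_samples=1000, n_features=10, n_informative=3,
                           n_redundant=0, random_state=0)

# Two linear models that offer decision_function() but no predict_proba().
base_models = {
    "ridge": RidgeClassifier(),
    "sgd_hinge": SGDClassifier(loss="hinge", random_state=0),
}

probas = {}
for name, base in base_models.items():
    # Wrap each model; the calibrator is fitted with 5-fold cross-validation.
    calibrated = CalibratedClassifierCV(base, cv=5, method="sigmoid")
    calibrated.fit(X[:900], y[:900])
    probas[name] = calibrated.predict_proba(X[900:])
    print(name, probas[name][:3])
```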