python machine-learning scikit-learn classification

Save classifier to disk in scikit-learn


How do I save a trained Naive Bayes classifier to disk and use it to predict data?

I have the following sample program from the scikit-learn website:

from sklearn import datasets
iris = datasets.load_iris()
from sklearn.naive_bayes import GaussianNB
gnb = GaussianNB()
y_pred = gnb.fit(iris.data, iris.target).predict(iris.data)
print("Number of mislabeled points : %d" % (iris.target != y_pred).sum())

Solution

  • Classifiers are just objects that can be pickled and dumped like any other. To continue your example:

    # cPickle exists only in Python 2; pickle works in Python 2 and 3
    import pickle

    # save the classifier
    with open('my_dumped_classifier.pkl', 'wb') as fid:
        pickle.dump(gnb, fid)

    # load it again
    with open('my_dumped_classifier.pkl', 'rb') as fid:
        gnb_loaded = pickle.load(fid)
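
The question also asks how to use the saved classifier to predict data. A minimal end-to-end sketch follows, assuming Python 3 and the standard `pickle` module. The temporary directory and the `same_as_original` check are illustrative additions that are not part of the original answer. The loaded object is an ordinary `GaussianNB` instance, so `predict` is called on it exactly as on the original.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn import datasets
from sklearn.naive_bayes import GaussianNB

# Train the classifier as in the question
iris = datasets.load_iris()
gnb = GaussianNB().fit(iris.data, iris.target)

# Illustrative location: a fresh temporary directory (an assumption, any writable path works)
model_path = os.path.join(tempfile.mkdtemp(), "my_dumped_classifier.pkl")

# Save the fitted classifier to disk
with open(model_path, "wb") as fid:
    pickle.dump(gnb, fid)

# Load it again, e.g. in a later session
with open(model_path, "rb") as fid:
    gnb_loaded = pickle.load(fid)

# Predict with the loaded classifier and compare against the in-memory one
y_pred = gnb_loaded.predict(iris.data)
same_as_original = np.array_equal(y_pred, gnb.predict(iris.data))

print("Number of mislabeled points : %d" % (iris.target != y_pred).sum())
print("Loaded model matches original:", same_as_original)
```

Two caveats apply: unpickling can execute arbitrary code, so only load files you trust, and a pickle written by one scikit-learn version is not guaranteed to load correctly in another. The scikit-learn documentation also describes `joblib` as an alternative for persisting models.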