Search code examples
pythonmachine-learningclassificationnltknaivebayes

Save Naive Bayes Trained Classifier in NLTK


I'm slightly confused in regard to how I save a trained classifier. As in, re-training a classifier each time I want to use it is obviously really bad and slow, how do I save it and the load it again when I need it? Code is below, thanks in advance for your help. I'm using Python with NLTK Naive Bayes Classifier.

classifier = nltk.NaiveBayesClassifier.train(training_set)
# look inside the classifier train method in the source code of the NLTK library

def train(labeled_featuresets, estimator=nltk.probability.ELEProbDist):
    # Create the P(label) distribution
    label_probdist = estimator(label_freqdist)
    # Create the P(fval|label, fname) distribution
    feature_probdist = {}
    return NaiveBayesClassifier(label_probdist, feature_probdist)

Solution

  • To save:

    import pickle
    f = open('my_classifier.pickle', 'wb')
    pickle.dump(classifier, f)
    f.close()
    

    To load later:

    import pickle
    f = open('my_classifier.pickle', 'rb')
    classifier = pickle.load(f)
    f.close()