How to persist a trained model in Python TextBlob?


How can I persist a trained TextBlob model so that I can update it in a later session?

The TextBlob repository and documentation can be found here: https://github.com/sloria/TextBlob

The documentation explains how to update a classifier with new training data, but I did not see a method for saving the trained state from a previous session.

How to update: https://textblob.readthedocs.io/en/dev/classifiers.html#updating-classifiers-with-new-data

In particular, I'm referring to classifying text. I feel a bit dumb on this topic, because in AI examples I always find it hard to tell where the results of a training session are persisted.

You don't want to run the whole training again, right? You want to pick up where you left off and keep improving the model iteratively.

I want to do this:

  1. If past training results exist, load them into the model
  2. Update the model or run a new training session
  3. Save the training session
  4. Repeat at a later time as needed

Solution

  • The trained classifier can be persisted by pickling it to a file and unpickling it in a later session.

    >>> from textblob.classifiers import NaiveBayesClassifier
    >>> train = [('love the weather','pos'),('love the world','pos'),('horrible place','neg')]
    >>> cl = NaiveBayesClassifier(train)
    >>> [cl.prob_classify("love food").prob('pos'),cl.prob_classify("love food").prob('neg')]
    [0.8590880780051973, 0.14091192199480246]
    >>> import pickle  # cPickle exists only in Python 2; use pickle on Python 3
    >>> save_training = open('/tmp/save_training.pickle','wb')
    >>> pickle.dump(cl,save_training)  # SAVE TRAINED CLASSIFIER
    >>> save_training.close()
    >>> 
    >>> load_training = open('/tmp/save_training.pickle','rb')
    >>> new_cl = pickle.load(load_training) # LOAD TRAINED CLASSIFIER
    >>> load_training.close()
    >>> [new_cl.prob_classify("love food").prob('pos'),new_cl.prob_classify("love food").prob('neg')]
    [0.8590880780051973, 0.14091192199480246]
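
    The four steps from the question (load if present, update, save, repeat) could be wrapped up roughly as follows. This is only a sketch: `load_classifier`, `save_classifier`, `run_session` and the file path are hypothetical names made up for illustration, not part of TextBlob. It assumes the `update` method described on the documentation page linked in the question, and that the pickled classifier keeps its earlier training data so that `update` extends it.

```python
import os
import pickle


def load_classifier(path):
    """Return the classifier pickled at `path`, or None if no file exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as fh:
        # Only unpickle files you wrote yourself: pickle can execute arbitrary code.
        return pickle.load(fh)


def save_classifier(classifier, path):
    """Pickle `classifier` to `path`, swapping the file in only after a full write."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as fh:
        pickle.dump(classifier, fh)
    os.replace(tmp_path, path)  # avoids leaving a half-written pickle behind


def run_session(path, new_data):
    """One training session: load (or create), update, save."""
    # Imported here so the helpers above work without TextBlob installed.
    from textblob.classifiers import NaiveBayesClassifier

    classifier = load_classifier(path)          # step 1: load past results
    if classifier is None:
        classifier = NaiveBayesClassifier(new_data)
    else:
        classifier.update(new_data)             # step 2: add new training data
    save_classifier(classifier, path)           # step 3: save for next time
    return classifier
```

    Calling `run_session('/tmp/save_training.pickle', [('love the world', 'pos')])` at a later time would then cover step 4. Writing to a temporary file first is a design choice, not a requirement; it just keeps the previous pickle intact if a save is interrupted.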