python, python-2.7, machine-learning, sklearn-pandas

Best practice to "transport" trained model from sklearn


from sklearn import datasets, linear_model

# Load a sample dataset and keep all but the last 20 rows for training
X, y = datasets.load_diabetes(return_X_y=True)
X_train, y_train = X[:-20], y[:-20]

# Create linear regression object
regr = linear_model.LinearRegression()

# Train the model using the training sets
regr.fit(X_train, y_train)

# How do I save the trained model at this point?

What is the best practice for saving the trained model so it can be used somewhere else?


Solution

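  • joblib can persist a fitted model to a file. Older scikit-learn releases bundled it as sklearn.externals.joblib, but that alias was deprecated in 0.21 and removed in 0.23, so import the standalone joblib package directly:

    import joblib

    # save the fitted model to disk
    joblib.dump(regr, 'file_name.pkl')

    # load the pickled model later
    regr = joblib.load('file_name.pkl')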
    

    You can also use Python's built-in pickle module, but the docs recommend joblib because it pickles objects that hold large NumPy arrays more efficiently.
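
To make the two options concrete, here is a minimal round-trip sketch, assuming Python 3 with the standalone joblib package installed. The diabetes dataset, the 20-row hold-out, the temporary directory, and the file names are illustrative choices that do not come from the question. It saves the fitted model with both joblib and pickle, reloads each copy, and checks that the restored models predict the same values as the original.

```python
import os
import pickle
import tempfile

import joblib
import numpy as np
from sklearn import datasets, linear_model

# Train a small model on a bundled dataset (illustrative, not from the question)
X, y = datasets.load_diabetes(return_X_y=True)
X_train, y_train = X[:-20], y[:-20]
X_test = X[-20:]

regr = linear_model.LinearRegression()
regr.fit(X_train, y_train)

# Write into a temporary directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    joblib_path = os.path.join(tmp_dir, "model.joblib")  # hypothetical file name
    pickle_path = os.path.join(tmp_dir, "model.pkl")     # hypothetical file name

    # Option 1: joblib
    joblib.dump(regr, joblib_path)
    regr_joblib = joblib.load(joblib_path)

    # Option 2: the standard-library pickle module
    with open(pickle_path, "wb") as f:
        pickle.dump(regr, f)
    with open(pickle_path, "rb") as f:
        regr_pickle = pickle.load(f)

# Both restored models should reproduce the original predictions
original_pred = regr.predict(X_test)
joblib_matches = np.allclose(original_pred, regr_joblib.predict(X_test))
pickle_matches = np.allclose(original_pred, regr_pickle.predict(X_test))
print(joblib_matches, pickle_matches)
```

Both formats rely on pickling under the hood, so it is safest to load a file with the same scikit-learn version that produced it, and to load only files from a source you trust.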