python python-2.7 machine-learning sklearn-pandas

Best practice to "transport" trained model from sklearn

import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model


# Create linear regression object
regr = linear_model.LinearRegression()

# Train the model on the training set (X_train and y_train are defined elsewhere)
regr.fit(X_train, y_train)
# How do I save the fitted model at this point?

What is the best practice for saving the trained model so it can be loaded and used somewhere else?

Solution

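joblib is the usual tool for persisting scikit-learn models to a file. Older scikit-learn releases bundled it as sklearn.externals.joblib; that import was deprecated in 0.21 and removed in 0.23, so on current versions import the standalone joblib package instead:

import joblib  # on scikit-learn < 0.21: from sklearn.externals import joblib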

joblib.dump(regr, 'file_name.pkl')

# load pickled model later
regr = joblib.load('file_name.pkl')
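
For a complete round trip, here is a minimal sketch that trains, saves, reloads, and checks the model. It assumes a current scikit-learn with the standalone joblib package installed; the bundled diabetes dataset, the 20-row hold-out, and the temporary file path are illustrative choices, not part of the original question.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn import datasets, linear_model

# Load a small dataset that ships with scikit-learn (illustrative choice)
# and hold out the last 20 rows for a quick check.
X, y = datasets.load_diabetes(return_X_y=True)
X_train, X_test = X[:-20], X[-20:]
y_train = y[:-20]

# Fit the model exactly as in the question.
regr = linear_model.LinearRegression()
regr.fit(X_train, y_train)

# Persist the fitted estimator to disk (hypothetical path in a temp dir).
model_path = os.path.join(tempfile.mkdtemp(), "regr.joblib")
joblib.dump(regr, model_path)

# Later, possibly in another process: restore the estimator and predict.
restored = joblib.load(model_path)
same_predictions = np.allclose(regr.predict(X_test), restored.predict(X_test))
print(same_predictions)
```

The reloaded estimator should behave like the original as long as it is loaded in an environment with a compatible scikit-learn version; loading across different versions is not guaranteed to work.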

You can also use Python's built-in pickle module, but the docs recommend joblib because it is more efficient at pickling objects that hold large NumPy arrays.
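
The pickle route looks roughly like this. It is a sketch under assumptions: the tiny synthetic dataset is made up for illustration, and serializing to an in-memory bytes object (rather than a file) is just one option.

```python
import pickle

import numpy as np
from sklearn import linear_model

# Tiny synthetic data following y = 2*x + 1 (illustrative only).
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0

regr = linear_model.LinearRegression().fit(X, y)

# Serialize the fitted estimator to bytes, then restore it.
payload = pickle.dumps(regr, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(payload)

# The restored model should give the same predictions as the original.
roundtrip_ok = np.allclose(restored.predict(X), regr.predict(X))
print(roundtrip_ok)
```

A bytes payload can be handy if the model has to go into a database or over a network instead of a file. Whichever module you use, only load pickles from sources you trust, since unpickling can execute arbitrary code.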