My question is similar to this.
I also use pickle to save and load a model, and I get the error below when calling pickle.load():
import pickle
from sklearn.preprocessing import StandardScaler
# SAVE
scaler = StandardScaler().fit(X_train)
X_trainScale = scaler.transform(X_train)
with open('scaler.scl', 'wb') as f:  # the with block closes the file after the dump
    pickle.dump(scaler, f)
# =================
# LOAD
with open('scaler.scl', 'rb') as f:
    sclr = pickle.load(f)  # => ModuleNotFoundError: No module named 'sklearn.preprocessing._data'
X_testScale = sclr.transform(X_test)
ModuleNotFoundError: No module named 'sklearn.preprocessing._data'
It looks like a sklearn version issue. My sklearn version is 0.20.3, Python version is 3.7.3.
However, I am running Python from an Anaconda .zip file. Is it possible to solve this without updating the version of sklearn?
I had exactly the same error message with StandardScaler using Anaconda.
Fixed it by running:
conda update --all
I think the issue was caused by creating the scaler file with pickle.dump on a machine with a newer version of scikit-learn and then calling pickle.load on a machine with an older version. Loading failed on the machine with the older scikit-learn but worked on the machine with the newer one (both were Windows machines). This is probably because more recent versions use a different naming convention with underscores (as mentioned above): as of scikit-learn 0.22, many internal modules were renamed with a leading underscore, so sklearn.preprocessing.data became sklearn.preprocessing._data, and an older install has no module under the name stored in the pickle.
Anaconda would not let me update the scikit-learn library on its own, because it claimed that it required the older version, for a reason I could not understand. Perhaps another library depended on it? So I fixed it by updating all the libraries at the same time, which worked.
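To make this kind of version mismatch easier to spot, one option is to store the library version next to the pickled object and compare it before loading. The following is only a minimal sketch of that idea, not something from scikit-learn itself: the helper names save_with_version and load_with_version are hypothetical, and the version is passed in as a plain string so the sketch runs with the standard library alone.

```python
import pickle


def save_with_version(obj, path, lib_version):
    """Write the library version first, then the object, as two pickles."""
    with open(path, "wb") as f:
        pickle.dump(lib_version, f)  # a plain string, loadable on any install
        pickle.dump(obj, f)


def load_with_version(path, current_version):
    """Read the version stamp and refuse to unpickle the object on a mismatch."""
    with open(path, "rb") as f:
        saved_version = pickle.load(f)
        if saved_version != current_version:
            raise RuntimeError(
                "pickle was written with version %s but version %s is installed"
                % (saved_version, current_version)
            )
        # Only reached when the versions match.
        return pickle.load(f)
```

With scikit-learn you would presumably pass sklearn.__version__ in both calls, for example save_with_version(scaler, 'scaler.scl', sklearn.__version__). Because the version string is written as its own pickle ahead of the scaler, the check runs before Python tries to import sklearn.preprocessing._data, so the older machine gets a readable message instead of ModuleNotFoundError. It does not make the file loadable there; the versions still have to be brought in line.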