When I try to load a pre-trained model through joblib inside a Docker container, I get the following error.
web_1 | 2018-02-06 15:11:50,826 INFO success: nginx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
web_1 | 2018-02-06 15:11:50,828 INFO success: uwsgi entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
web_1 | Traceback (most recent call last):
web_1 | File "./app/main.py", line 23, in <module>
web_1 | svm_detector_reloaded=joblib.load(filename);
web_1 | File "/usr/local/lib/python3.6/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 578, in load
web_1 | obj = _unpickle(fobj, filename, mmap_mode)
web_1 | File "/usr/local/lib/python3.6/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 508, in _unpickle
web_1 | obj = unpickler.load()
web_1 | File "/usr/local/lib/python3.6/pickle.py", line 1050, in load
web_1 | dispatch[key[0]](self)
web_1 | File "/usr/local/lib/python3.6/pickle.py", line 1338, in load_global
web_1 | klass = self.find_class(module, name)
web_1 | File "/usr/local/lib/python3.6/pickle.py", line 1392, in find_class
web_1 | return getattr(sys.modules[module], name)
web_1 | AttributeError: module '__main__' has no attribute 'split_into_lemmas'
web_1 | unable to load app 0 (mountpoint='') (callable not found or import error)
web_1 | *** no app loaded. going in full dynamic mode ***
web_1 | *** uWSGI is running in multiple interpreter mode ***
My main.py looks like this:
from flask import Flask
from flask import request
from flask import jsonify
from textblob import TextBlob
import sklearn
import numpy as np
from sklearn.externals import joblib
app = Flask(__name__)
from .api.utils import split_into_lemmas as split_into_lemmas
def split_into_lemmas(message):
    message = message.lower()
    words = TextBlob(message).words
    # for each word, take its "base form" = lemma
    return [word.lemma for word in words]

def tollower(message):
    return message.lower()

filename = '../../data/sms_spam_detector.pkl'
svm_detector_reloaded = joblib.load(filename)

text = "Testing"
lowerText = tollower(text)

@app.route('/')
def hello():
    return tollower("Test Test ")

@app.route('/detect/')
def route_detect():
    SMS = request.args.get('SMS')
    if SMS is None or SMS == '':
        SMS = "Test"
    return tollower(SMS)
    # test = [SMS]
    # message = svm_detector_reloaded.predict(test)[0]
    # return SMS + " " + message

if __name__ == "__main__":
    # Only for debugging while developing
    app.run(host='0.0.0.0')
Basically, I downloaded example-flask-package-python3.6.zip from tiangolo/uwsgi-nginx-flask, added a data directory, and modified the Dockerfile and main.py. main.py is pasted above, and the Dockerfile looks like this:
FROM tiangolo/uwsgi-nginx-flask:python3.6
ENV LISTEN_PORT 8080
EXPOSE 8080
RUN pip3 install numpy TextBlob scikit-learn scipy
COPY ./app /app
COPY ./data /data
Then I copied the prebuilt model (stored via joblib) to the newly created data directory. The entire code works perfectly fine when I run it directly with python main.py, but when I issue the docker-compose up command I get the error above. If I comment out the line svm_detector_reloaded = joblib.load(filename), the container comes up and everything works, except for the machine learning part.
Basically, the function split_into_lemmas defined in main.py is not accessible while unpickling the model.
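As I understand it, this happens because pickle does not serialize a function's code, only a reference to it: the module name plus the function's qualified name. The model was trained in a script where split_into_lemmas lived at the top level, so the pickle records a reference like __main__.split_into_lemmas; under uWSGI, __main__ is the server process, which has no such attribute. A minimal sketch of what actually ends up in the pickle bytes (the function body here is just a stand-in for illustration):

```python
import pickle

def split_into_lemmas(message):
    # stand-in body, not the real lemmatizer
    return message.lower().split()

payload = pickle.dumps(split_into_lemmas)
# The payload contains only the reference (module name + the string
# "split_into_lemmas"), not the function's bytecode, so the unpickling
# process must be able to look that exact attribute up again.
```

This is why the same pickle loads fine under python main.py (where __main__ *is* main.py and does define the function) but fails under uWSGI.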
What am I doing wrong here? The model was built by following the steps mentioned at http://radimrehurek.com/data_science_python; the actual model is built at step 6.
OK, I was able to resolve it. I got a clue from 3614379. I first moved the function split_into_lemmas into its own module (a separate .py file) and imported that module while training, instead of keeping the function in the main file. Then I imported the same module in my Docker instance as well, which resolved the issue.
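The fix works because the pickle then records the shared module's import path instead of __main__, so any process that can import that module (the training script, the local run, or the Docker container) can reload the model. A minimal, self-contained sketch of the mechanism; the module name lemmas_mod and the function body are placeholders, not names from the original project:

```python
import importlib
import os
import pickle
import sys
import tempfile

# Write the shared module to disk (hypothetical name "lemmas_mod.py").
# In a real project this is just a regular source file imported by both
# the training script and the Flask app.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "lemmas_mod.py"), "w") as fh:
    fh.write(
        "def split_into_lemmas(message):\n"
        "    return message.lower().split()\n"
    )
sys.path.insert(0, tmpdir)

lemmas_mod = importlib.import_module("lemmas_mod")

# The pickle now records "lemmas_mod.split_into_lemmas" rather than
# "__main__.split_into_lemmas", so unpickling succeeds wherever
# lemmas_mod is importable (e.g. inside the Docker container).
payload = pickle.dumps(lemmas_mod.split_into_lemmas)
restored = pickle.loads(payload)
```

In practice this means the training script does from lemmas_mod import split_into_lemmas before building and dumping the model, and main.py does the same import before calling joblib.load, so the unpickler can resolve the reference.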