I have trained an LGBM classifier and can run it in a local notebook using:

```python
model = jlb.load('fnol-v13-pipeline-high-g-analysis-f1-focal-retrain-{}.pkl'.format(model_version))
probs = model.predict_proba(x_test)
probs = [x[0] for x in probs]
```
However, in a Flask API running in a Docker container under gunicorn, the line `probs = model.predict_proba(x_test)` causes a `[CRITICAL] WORKER TIMEOUT` without raising an error. I have tried printing the `x_test` dataframe as a dictionary immediately before this line runs; when I reload that dictionary as a dataframe in my notebook and call predict, the scores come back as normal. Copying the same dictionary back into the container still causes the timeout. The model is loaded from the same file in both cases.
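The round trip described above can be sketched like this (a minimal sketch; the column names and values are placeholders, not the real features):

```python
import pandas as pd

# Placeholder frame standing in for x_test; the real feature names and values differ.
x_test = pd.DataFrame({"feature_a": [0.12, 0.34], "feature_b": [5, 6]})

# Dump to a plain dict right before the predict call...
payload = x_test.to_dict(orient="list")

# ...and rebuild it elsewhere (e.g. in a notebook) to replay the exact same input.
restored = pd.DataFrame(payload)

print(restored.equals(x_test))  # → True: the round trip preserves the frame
```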
Any idea what could be causing this, or how I can debug it? I have set gunicorn's `--log-level=debug`, but it still only shows the timeout. I presume the problem lies with the LGBM model, but it only occurs inside the container.
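Since the timeout kills the worker without a traceback, one way to see where it is stuck is the stdlib `faulthandler` module, which can dump every thread's stack after a deadline. A hypothetical debugging sketch (in the real app the file would be `sys.stderr` and the timer longer than a normal request; here a 1 s timer and a `time.sleep` stand in for the hanging `model.predict_proba(x_test)`):

```python
import faulthandler
import tempfile
import time

# Arm a delayed stack dump before the suspect call. If the call is still
# running when the timer fires, every thread's traceback is written out,
# so the container logs show where the worker was stuck even though
# nothing raised.
with tempfile.NamedTemporaryFile("w+", suffix=".log", delete=False) as log:
    faulthandler.dump_traceback_later(1, exit=False, file=log)
    time.sleep(2)  # stand-in for the hanging model.predict_proba(x_test)
    faulthandler.cancel_dump_traceback_later()
    log.seek(0)
    dump = log.read()

# The dump starts with a "Timeout" header followed by each thread's stack.
print("Timeout" in dump)
```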
Update: I removed the gunicorn wrapper and the Flask app runs without issue, so I presume there is some compatibility issue between gunicorn and LGBM?
Further update: I switched to uWSGI and it runs fine, but I'll leave this open in case anyone knows what went wrong with gunicorn.
I eventually found that all I needed to do was add `--worker-class=gthread` to my gunicorn entrypoint.
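For reference, the same setting can live in a gunicorn config file instead of the command line. A sketch (the bind address, worker and thread counts, and timeout are illustrative; `worker_class` is the setting the flag above maps to):

```python
# gunicorn_conf.py -- load with: gunicorn -c gunicorn_conf.py app:app
bind = "0.0.0.0:8000"
workers = 2
worker_class = "gthread"  # threaded workers instead of the default sync workers
threads = 4
timeout = 120             # seconds before the master logs WORKER TIMEOUT
```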