We are trying to deploy a Docker container on Google Cloud Vertex AI. However, every time we send a request to the Vertex AI endpoint, the response contains only the model specifications and not the model's predictions (see picture).
Here is the code of the app.py file we use:
import json
from typing import Any

import tornado.httputil
import tornado.ioloop
import tornado.web

# HealthCheckHandler is defined elsewhere in app.py (not shown here).

class PredictionHandler(tornado.web.RequestHandler):
    def __init__(
        self,
        application: "Application",
        request: tornado.httputil.HTTPServerRequest,
        **kwargs: Any
    ) -> None:
        super().__init__(application, request, **kwargs)

    def post(self):
        response_body = None
        try:
            response_body = json.dumps({'predictions': 'Prediction ran succsessfully'})
        except Exception as e:
            response_body = json.dumps({'error': str(e)})
        self.set_header("Content-Type", "application/json")
        self.set_header("Content-Length", len(response_body))
        self.write(response_body)
        self.finish()

def make_app():
    tornado_app = tornado.web.Application([('/health_check', HealthCheckHandler),
                                           ('/predict', PredictionHandler)],
                                          debug=False)
    tornado_app.listen(8080)
    print("Running App...")
    tornado.ioloop.IOLoop.current().start()
And here is the Dockerfile:
FROM python:3.8
RUN mkdir -p /app/model
WORKDIR /app
COPY requirements.txt /app/requirements.txt
RUN pip3 install --no-cache-dir -r requirements.txt
COPY app /app/
EXPOSE 8080
ENTRYPOINT ["python3", "app.py"]
Vertex AI expects the response in a specific format, {"predictions": [...]}, where the value of "predictions" is a list. If the response does not follow that format, the predictions are not returned (I'm almost sure).
I do not know Tornado, but based on your code you probably need to change
response_body = json.dumps({'predictions': 'Prediction ran succsessfully'})
to
response_body = json.dumps({'predictions': ['Prediction ran succsessfully']})
and see if that works.
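
If it helps, here is a minimal sketch of a helper that always produces a body in the {"predictions": [...]} shape, so a single value cannot slip through unwrapped. The names build_prediction_body and build_error_body are my own and are not part of Tornado or Vertex AI. I am also assuming that wrapping a single result in a one-element list is what you want.

```python
import json
from typing import Any


def build_prediction_body(predictions: Any) -> str:
    """Serialize predictions as {"predictions": [...]}.

    Hypothetical helper: a list is passed through unchanged, a tuple is
    converted to a list, and any other value is wrapped in a one-element list.
    """
    if isinstance(predictions, tuple):
        predictions = list(predictions)
    elif not isinstance(predictions, list):
        predictions = [predictions]
    return json.dumps({"predictions": predictions})


def build_error_body(exc: Exception) -> str:
    """Serialize an exception message as {"error": "..."} (hypothetical helper)."""
    return json.dumps({"error": str(exc)})
```

In your post() method you could then write response_body = build_prediction_body('Prediction ran succsessfully') in the try branch and response_body = build_error_body(e) in the except branch, leaving the rest of the handler as it is.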