google-cloud-platform, pytorch, google-cloud-ml, google-cloud-vertex-ai, torchserve

Deployment with custom handler on Google Cloud Vertex AI


I'm trying to deploy a TorchServe instance on the Google Vertex AI platform, but per their documentation (https://cloud.google.com/vertex-ai/docs/predictions/custom-container-requirements#response_requirements), responses must have the following shape:

{
  "predictions": PREDICTIONS
}

Where PREDICTIONS is an array of JSON values representing the predictions that your container has generated.
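For concreteness, a container generating the embeddings described below would be expected to answer with a body like the one sketched here (the embedding values are illustrative, not real model output):

```python
import json

# The response shape Vertex AI expects from a custom container:
# a JSON object with a single top-level "predictions" array.
response = {"predictions": [[1, 2, 1], [2, 3, 3]]}

body = json.dumps(response)
print(body)  # {"predictions": [[1, 2, 1], [2, 3, 3]]}
```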

Unfortunately, when I try to return such a shape from the postprocess() method of my custom handler:

def postprocess(self, data):
    return {
        "predictions": data
    }

TorchServe returns:

{
  "code": 503,
  "type": "InternalServerException",
  "message": "Invalid model predict output"
}

Please note that data is a list of lists, for example [[1, 2, 1], [2, 3, 3]] (basically, I am generating embeddings from sentences).
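A likely explanation for the 503 (an assumption based on TorchServe's batching contract, under which postprocess must return a list with one entry per request in the batch) is that returning a bare dict violates that contract. A minimal sketch of a workaround, with a hypothetical handler class name:

```python
class EmbeddingHandler:
    def postprocess(self, data):
        # TorchServe expects postprocess to return a list whose length
        # matches the number of requests in the batch; a bare dict is
        # rejected with "Invalid model predict output". Wrapping the
        # dict in a one-element list (assuming batch size 1) keeps
        # TorchServe happy while still emitting the
        # {"predictions": ...} shape Vertex AI requires.
        return [{"predictions": data}]
```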

Now if I simply return data (not wrapped in a Python dictionary), it works with TorchServe, but when I deploy the container on Vertex AI it returns the following error: ModelNotFoundException. I assume Vertex AI throws this error because the return shape does not match what is expected (cf. the documentation).

Did anybody successfully manage to deploy a TorchServe instance with custom handler on Vertex AI?


Solution

  • Making sure that the TorchServe handler processes the input dictionary (the "instances" key) correctly solved the issue. The approach described in the article did not work for me.
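A minimal sketch of the input-side handling the answer refers to — unwrapping the {"instances": ...} envelope Vertex AI sends — might look like this. The class and variable names are illustrative, the inference step is elided, and batch size 1 is assumed:

```python
import json

class SentenceEmbeddingHandler:
    def preprocess(self, data):
        # TorchServe passes a list of request dicts; each holds the
        # raw payload under "body" or "data", possibly as undecoded
        # bytes. Vertex AI wraps the actual inputs in
        # {"instances": [...]}, so unwrap that envelope here.
        sentences = []
        for row in data:
            body = row.get("body") or row.get("data")
            if isinstance(body, (bytes, bytearray)):
                body = json.loads(body)
            sentences.extend(body["instances"])
        return sentences

    def postprocess(self, data):
        # Return a list with one entry per request in the batch
        # (batch size 1 assumed) so TorchServe accepts the output,
        # shaped as the {"predictions": ...} object Vertex AI expects.
        return [{"predictions": data}]
```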