pytorch google-cloud-vertex-ai torchserve

Google Vertex AI Prediction: Why is TorchServe showing 0 GPUs?

I have deployed a trained PyTorch model to a Google Vertex AI Prediction endpoint. The endpoint is working fine, giving me predictions, but when I examine its logs in Logs Explorer, I see:

INFO 2023-01-11T10:34:53.270885171Z Number of GPUs: 0

INFO 2023-01-11T10:34:53.270888834Z Number of CPUs: 4

This is despite the fact that I set the endpoint to use NVIDIA_TESLA_T4 as the accelerator type.

Why does the log show 0 GPUs, and does this mean TorchServe is not taking advantage of the accelerator GPU?

Solution

This is a common problem with PyTorch and CUDA. PyTorch can only use a GPU when the installed build was compiled with CUDA support; a CPU-only build cannot see the GPU even when an accelerator is attached to the machine. So it’s recommended that you serve the model from a container image that ships a CUDA-enabled build of PyTorch.
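One way to check which case applies is to run a short script inside the serving container. The sketch below is only an illustration: the helper names `describe_gpu_support` and `inspect_pytorch` are made up for this answer, and it assumes you can open a Python interpreter in the same image the endpoint uses. It relies on `torch.version.cuda` (which is `None` on CPU-only builds), `torch.cuda.is_available()`, and `torch.cuda.device_count()`.

```python
def describe_gpu_support(cuda_build, cuda_available, device_count):
    """Turn three facts about a PyTorch install into a one-line diagnosis.

    Hypothetical helper written for this answer, not part of PyTorch.
    """
    if cuda_build is None:
        # torch.version.cuda is None when PyTorch was built without CUDA.
        return "CPU-only PyTorch build: use a CUDA-enabled build or GPU image"
    if not cuda_available or device_count == 0:
        # CUDA build is present, but no driver or device is visible to it.
        return f"PyTorch built for CUDA {cuda_build}, but no usable GPU is visible"
    return f"PyTorch built for CUDA {cuda_build}, {device_count} GPU(s) visible"


def inspect_pytorch():
    """Collect the facts from the PyTorch installed in this environment."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    return describe_gpu_support(
        torch.version.cuda,          # CUDA version of the build, or None
        torch.cuda.is_available(),   # True if a CUDA device can be used
        torch.cuda.device_count(),   # number of devices PyTorch can see
    )


if __name__ == "__main__":
    print(inspect_pytorch())
```

If the script reports a CPU-only build, the image is the problem rather than the endpoint's accelerator setting. If it reports a CUDA build but no visible GPU, the cause is more likely the driver or the device not being exposed to the container.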