google-cloud-platform google-ai-platform

GCP AI Platform Unified - AutoScaling


The GCP AI Platform Unified documentation says:

AI Platform scales your nodes based on CPU usage even if you have configured your prediction nodes to use GPUs; therefore if your prediction throughput is causing high GPU usage, but not high CPU usage, your nodes might not scale as you expect

How do we scale based on GPU usage?


Solution

    1. It seems that legacy AI Platform can scale on GPU usage [1], although the feature is still in preview there. It does not appear to have been added to AI Platform Unified yet.
    2. You can watch the AI Platform Unified release notes [2] for updates on this feature.

    [1] https://cloud.google.com/ai-platform/prediction/docs/machine-types-online-prediction#specifying_gpus

    [2] https://cloud.google.com/ai-platform-unified/docs/resources/release-notes
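
For illustration, here is a rough sketch of what a GPU-based autoscaling request body might look like on legacy AI Platform, going by the page linked in [1]. The field names (`acceleratorConfig`, `autoScaling.metrics`, `GPU_DUTY_CYCLE`) are written from memory of the legacy `projects.models.versions` REST resource and should be checked against [1] before use. The machine type, accelerator type, target value, bucket path, and the helper name `build_version_body` are hypothetical. The code only builds the JSON body; it does not call any Google API, and a real request would need further fields such as the runtime version and framework.

```python
import json


def build_version_body(name, deployment_uri, min_nodes=1, max_nodes=3, gpu_target=60):
    """Build a partial request body for a legacy AI Platform model version.

    Assumption: the legacy API accepts an autoscaling metric named
    GPU_DUTY_CYCLE with an integer percentage target. Verify against [1].
    """
    if not 0 < gpu_target <= 100:
        raise ValueError("gpu_target must be a percentage in (0, 100]")
    if min_nodes < 1 or max_nodes < min_nodes:
        raise ValueError("need 1 <= min_nodes <= max_nodes")

    return {
        "name": name,
        "deploymentUri": deployment_uri,
        # Hypothetical machine and accelerator choice, for illustration only.
        "machineType": "n1-standard-4",
        "acceleratorConfig": {"count": 1, "type": "NVIDIA_TESLA_T4"},
        "autoScaling": {
            "minNodes": min_nodes,
            "maxNodes": max_nodes,
            # Scale on GPU duty cycle instead of the default CPU usage.
            "metrics": [{"name": "GPU_DUTY_CYCLE", "target": gpu_target}],
        },
    }


if __name__ == "__main__":
    body = build_version_body("v1", "gs://my-bucket/model/")
    print(json.dumps(body, indent=2))
```

Nothing equivalent is shown for AI Platform Unified because, per the answer above, the feature does not appear to be available there yet.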