tensorflow google-cloud-platform tpu gcp-ai-platform-training

How does Google Cloud AI Platform allocate resources in a given region? Does it follow the quota restrictions?


I have a free allotment of TPUs allocated for zone us-central1-a, but only for that specific zone. When I set up AI Platform jobs, I can only specify a region (us-central).

Will AI Platform pick a random zone within the region based only on availability? Is there a way I can restrict it to the given zone?


Solution

  • First, let's clarify the naming convention: us-central1-a is a zone within the us-central1 region.

    Allocation is indeed based on availability. You cannot explicitly select the zone, only the region. If your free resources are bound to a zone, one way to make sure you stay in us-central1-a is to select a TPU type that is only offered in that zone. At the time of writing, this would be:

    • v2-32 256 GiB
    • v2-128 1 TiB
    • v2-256 2 TiB
    • v2-512 4 TiB

    This list is based on these docs. I'd check them from time to time to avoid nasty surprises. Typically, though, customer service will be able to help you if you run into billing issues (for example, if your job ends up in a different zone).
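
To make the approach above concrete, here is a minimal Python sketch, not an official recipe. It derives the region from a zone name and builds a training-input dictionary that only accepts the zone-bound TPU types listed in this answer. The helper names (`region_of`, `build_training_input`) are hypothetical. The field names and values (`scaleTier`, `masterType`, `workerType`, `acceleratorConfig`, `TPU_V2_POD`, `n1-highcpu-16`) are my assumptions about the AI Platform Training job format, so please verify them against the current docs before submitting anything.

```python
# TPU types this answer lists as bound to us-central1-a at the time of
# writing, with the memory sizes quoted above. Re-check the docs before use.
ZONE_BOUND_TPU_TYPES = {
    "v2-32": "256 GiB",
    "v2-128": "1 TiB",
    "v2-256": "2 TiB",
    "v2-512": "4 TiB",
}


def region_of(zone: str) -> str:
    """Return the region part of a zone name (us-central1-a -> us-central1)."""
    region, _, suffix = zone.rpartition("-")
    if not region or not suffix:
        raise ValueError(f"not a zone name: {zone!r}")
    return region


def build_training_input(tpu_type: str, zone: str = "us-central1-a") -> dict:
    """Build a training-input dict for a TPU type that is bound to `zone`.

    Only the region can be set on the job. The zone is pinned indirectly by
    refusing any TPU type that is not in the zone-bound list above.
    """
    if tpu_type not in ZONE_BOUND_TPU_TYPES:
        raise ValueError(f"{tpu_type!r} is not in the zone-bound list for {zone}")
    version, _, cores = tpu_type.partition("-")
    return {
        "region": region_of(zone),
        # Everything below is an assumed job layout, not confirmed by the answer.
        "scaleTier": "CUSTOM",
        "masterType": "n1-highcpu-16",
        "workerType": "cloud_tpu",
        "workerCount": 1,
        "workerConfig": {
            "acceleratorConfig": {
                "type": f"TPU_{version.upper()}_POD",
                "count": int(cores),
            }
        },
    }


if __name__ == "__main__":
    print(build_training_input("v2-32"))
```

Calling `build_training_input("v3-8")` raises a `ValueError`, which is the point of the sketch: a TPU type outside the list could be scheduled in any zone of the region and fall outside the free allotment.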