Tags: google-cloud-platform, tpu, google-cloud-tpu

Can't create a TPU node/VM since March 4


Since around March 4, I have been unable to create a Cloud TPU node.

When I try to create a TPU node/VM through the GUI, the page crashes as soon as I choose a TPU type, in any region. The browser console fills with JS errors:

ERROR TypeError: Cannot read properties of undefined (reading 'CP-CLOUD-TPU-V3')
m=b:90 ERROR TypeError: Cannot read properties of undefined (reading 'CP-CLOUD-TPU-V3')
m=b:90 ERROR TypeError: Cannot read properties of undefined (reading 'CP-CLOUD-TPU-V3')
m=b:90 ERROR TypeError: Cannot read properties of undefined (reading 'CP-CLOUD-TPU-V3')
m=b:90 ERROR TypeError: Cannot read properties of undefined (reading 'CP-CLOUD-TPU-V3')

Creating a TPU VM from Cloud Shell fails with error code 13 for every combination of zone and version I tried:

gcloud alpha compute tpus tpu-vm create testnode --zone us-central1-a --accelerator-type='v3-8' --version='v2-alpha' --scopes='cloud-platform'
ERROR: (gcloud.alpha.compute.tpus.tpu-vm.create) {
  "code": 13,
  "message": "an internal error has occurred"
}

What I tested:

  1. Attempting the same procedure with a different project - same behavior and error.
  2. Attempting the same procedure with a new account that never used Cloud TPU before - same behavior and error.
  3. Using Chrome on an Android phone over a mobile network - same behavior and error.
  4. Quotas are fine.

I noticed that google-cloud-tpu 1.3.2 was released on March 8, but I am not sure whether that is related to this issue.

Other parts of GCP, such as VM instances and Cloud Storage, work fine; only TPU has been down for me.


Solution

  • You can try this:

     gcloud alpha compute tpus tpu-vm create testnode \
       --zone us-central1-a --accelerator-type='v3-8' --version='v2-alpha' \
       --scopes='https://www.googleapis.com/auth/cloud-platform'
    

    The short form --scopes='cloud-platform' is not supported for TPUs, so pass the full scope URL instead.
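
    If your scripts already pass the short alias around, a small shell helper can expand it before calling gcloud. This is only a sketch: expand_scope is a hypothetical function name, not part of gcloud, and it only knows the cloud-platform alias mentioned above. Anything else is passed through unchanged.

```shell
#!/bin/sh
# Hypothetical helper (not part of gcloud): expand the short
# 'cloud-platform' alias into the full scope URL used in the answer.
# Any other value, including an already-full URL, is returned unchanged.
expand_scope() {
  case "$1" in
    cloud-platform)
      printf '%s\n' 'https://www.googleapis.com/auth/cloud-platform'
      ;;
    *)
      printf '%s\n' "$1"
      ;;
  esac
}

SCOPE="$(expand_scope cloud-platform)"
echo "$SCOPE"

# The expanded value can then be passed to the command from the answer:
# gcloud alpha compute tpus tpu-vm create testnode \
#   --zone us-central1-a --accelerator-type='v3-8' --version='v2-alpha' \
#   --scopes="$SCOPE"
```

    The gcloud call is left commented out so the snippet can be run safely without creating any resources.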