We have a Cloud Workflows workflow orchestrating a few Cloud Batch jobs, one of them running for hours. After a few weeks of running, today we started to have problems.
When creating the batch job, we get ZONE_RESOURCE_POOL_EXHAUSTED errors. Our first thought was machine availability in that zone, but moving from us-central1 to europe-west3 did not fix the problem.
Then, after investigating a little, we realised that we are at 96% of the Persistent Disk SSD (GB) quota for us-central1, as the attached picture shows.
First question: Can the current quota usage block the creation of new machines? If so, why does the error say that there is a problem with the zone?
Second question: How can we find out what is consuming this quota? We are not aware of using this many SSD disks. Could it be a Google Cloud Shell that one person uses daily to run a few jobs manually?
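One way to see what is consuming the SSD quota is to export the project's disks with `gcloud compute disks list --format=json` and total them per region. The snippet below is a minimal sketch under stated assumptions: the helper names are made up for illustration, and the set of disk types that count toward the quota defaults to `pd-ssd` only, which should be checked against the quota documentation.

```python
import json
from collections import defaultdict


def _last(url):
    """Return the last path segment of a resource URL."""
    return url.rsplit("/", 1)[-1]


def _region_of(disk):
    """Derive the region of a zonal or regional disk entry."""
    if "zone" in disk:
        # A zone name such as "us-central1-a" is the region plus a suffix.
        return _last(disk["zone"]).rsplit("-", 1)[0]
    return _last(disk.get("region", "unknown"))


def ssd_gb_by_region(disks, ssd_types=("pd-ssd",)):
    """Sum disk sizes (GB) per region for the given disk types.

    `ssd_types` is an assumption: adjust it to the disk types that
    actually count toward the quota in your project.
    """
    totals = defaultdict(int)
    for disk in disks:
        if _last(disk.get("type", "")) in ssd_types:
            totals[_region_of(disk)] += int(disk["sizeGb"])
    return dict(totals)


def ssd_disks_by_size(disks, ssd_types=("pd-ssd",)):
    """List (name, size_gb, attached_to) tuples, largest disk first."""
    rows = [
        (d["name"], int(d["sizeGb"]), [_last(u) for u in d.get("users", [])])
        for d in disks
        if _last(d.get("type", "")) in ssd_types
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def load_disks(path):
    """Load the JSON saved from `gcloud compute disks list --format=json`."""
    with open(path) as handle:
        return json.load(handle)
```

For example, after saving the command output to a file (the name `disks.json` is hypothetical), `ssd_gb_by_region(load_disks("disks.json"))` gives the per-region totals, and `ssd_disks_by_size(...)` shows which instances each disk is attached to. Disks with an empty attachment list are candidates for leftovers from earlier jobs.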
Notes:
Any advice will be welcome. Thanks!
After 6 hours, it looks like the graph and the quota value have still not been updated.
It looks like it was a problem on the GCP side. I was able to create a virtual machine with a small 30 GB disk. Right after that, all the graphs were fine and the usage moved from 480 GB to 30 GB, as expected.
After deleting the VM, the quota showed no usage and I was able to use it with no problems.
This does not look good to me, and I'm concerned about how often it will occur.
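To cross-check a quota graph that seems stale, the reported usage can be read from the output of `gcloud compute regions describe us-central1 --format=json`, which includes a list of quotas with a metric name, a limit and a usage value. The sketch below assumes the SSD quota appears under the metric name `SSD_TOTAL_GB`; the function name is hypothetical.

```python
def quota_usage(region_info, metric="SSD_TOTAL_GB"):
    """Return (usage, limit, percent_used) for one quota metric.

    `region_info` is the parsed JSON from
    `gcloud compute regions describe REGION --format=json`.
    Returns None if the metric is not present.
    """
    for quota in region_info.get("quotas", []):
        if quota.get("metric") == metric:
            usage = float(quota["usage"])
            limit = float(quota["limit"])
            # Guard against a zero limit to avoid dividing by zero.
            percent = 100.0 * usage / limit if limit else 0.0
            return usage, limit, percent
    return None
```

If the value returned here disagrees with the total size of the disks that actually exist in the region, that points to stale quota accounting rather than real usage.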