google-kubernetes-engine gke-networking

Multi-zonal GKE cluster for batch processing


I am batch processing data using auto-scaling preemptible nodes on a GKE zonal cluster. Every now and then, GPUs become scarce. Rather than switching zones to chase GPUs (which I've already done), I've tried changing to a multi-zonal configuration. From my point of view, things seem to be working OK on some light- to medium-scale workloads.

I see warnings in the UI about unbalanced node pools, as the node pools seem to be scaling up in zones where there are available resources. Is this warning serious? What are the ramifications of different node numbers in different zones? Should I instead run separate pools per zone?

I have a fair amount of communication between nodes. How much is my bandwidth affected when workers are in separate zones? The GKE docs indicate no ingress limitation, and say only that egress between zones is slower than within a zone and faster than between regions.


Solution

  • According to the Bandwidth summary table, ingress is not limited. For egress, bandwidth between nodes deployed in different zones is slightly lower than bandwidth between nodes in the same zone.

    Cluster autoscaler only balances across zones during a scale-up event. It scales down underutilized nodes regardless of the relative sizes of the underlying managed instance groups in a node pool, which can leave the nodes distributed unevenly across zones.

    If you specify a minimum of zero nodes, an idle node pool can scale down completely. However, at least one node must always be available in the cluster to run system Pods.

    Refer to link for more information about balanced node groups.
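
To make the scale-up versus scale-down asymmetry concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not the real cluster autoscaler algorithm: the zone names, the "add to the smallest zone that has capacity" rule, and the helper functions `scale_up`, `scale_down`, and `imbalance` are all hypothetical and exist purely for illustration.

```python
# Toy model (NOT the real GKE cluster autoscaler): zone names and the
# balancing rule below are illustrative assumptions only.

def scale_up(pool, zones_with_capacity, count):
    """Add `count` nodes, one at a time, to the smallest zone that has capacity.

    Ties are broken alphabetically by zone name so the result is deterministic.
    """
    for _ in range(count):
        zone = min(zones_with_capacity, key=lambda z: (pool[z], z))
        pool[zone] += 1


def scale_down(pool, underutilized_node_zones):
    """Remove one node per listed zone, ignoring how large each zone is."""
    for zone in underutilized_node_zones:
        if pool[zone] > 0:
            pool[zone] -= 1


def imbalance(pool):
    """Difference between the largest and smallest per-zone node count."""
    return max(pool.values()) - min(pool.values())


if __name__ == "__main__":
    pool = {"zone-a": 0, "zone-b": 0, "zone-c": 0}

    # Scale-up with capacity everywhere: nodes are spread evenly.
    scale_up(pool, ["zone-a", "zone-b", "zone-c"], 6)
    print(pool, "imbalance:", imbalance(pool))

    # Scale-down removes idle nodes wherever they are.
    scale_down(pool, ["zone-a", "zone-a", "zone-b"])
    print(pool, "imbalance:", imbalance(pool))

    # Next scale-up: assume zone-a has no GPUs available right now.
    scale_up(pool, ["zone-b", "zone-c"], 2)
    print(pool, "imbalance:", imbalance(pool))
```

In this model the pool starts balanced, then drifts once idle nodes are removed without regard to zone and a later scale-up can only place nodes where capacity exists. That is the same pattern as the uneven distribution described above.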