
How to Properly Change a Google Kubernetes Engine Node Pool Using Terraform?


I have successfully created a Google Kubernetes Engine (GKE) cluster ($GKE_CLUSTER_NAME) inside of a Google Cloud Platform (GCP) project ($GCP_PROJECT_NAME):

gcloud container clusters list \
--format="value(name)" \
--project=$GCP_PROJECT_NAME

#=>

. . .
$GKE_CLUSTER_NAME
. . .

which uses the node pool $GKE_NODE_POOL:

gcloud container node-pools list \
--cluster=$GKE_CLUSTER_NAME \
--format="value(name)" \
--zone=$GKE_CLUSTER_ZONE

#=>

$GKE_NODE_POOL

I am checking this configuration into source control using Terraform, with the following container_node_pool.tf:

resource "google_container_node_pool" ". . ." {
  autoscaling {
    max_node_count = 3
    min_node_count = 3
  }

  . . .

  initial_node_count = 3

  . . .

}

and I confirmed that the Terraform configuration above matches the $GKE_NODE_POOL currently running inside $GKE_CLUSTER_NAME and $GCP_PROJECT_NAME:

terraform plan

#=>

No changes. Your infrastructure matches the configuration.

Terraform has compared your real infrastructure against your configuration and found no differences, so no changes are needed.

If I change the configuration of $GKE_NODE_POOL:

resource "google_container_node_pool" ". . ." {
  autoscaling {
    max_node_count = 4
    min_node_count = 4
  }

  . . .

  initial_node_count = 4

  . . .

}

to scale the number of nodes in $GKE_NODE_POOL from 3 to 4, terraform plan reports that the node pool will be destroyed and recreated:

terraform plan

#=>

. . .

Plan: 1 to add, 0 to change, 1 to destroy.

. . .

How can I update $GKE_NODE_POOL without destroying and then recreating the resource?


Solution

  • Changing the initial_node_count argument of a google_container_node_pool forces Terraform to destroy and recreate the node pool. Leave initial_node_count unchanged, and you should be able to modify other $GKE_NODE_POOL arguments, such as min_node_count and max_node_count, in place.

    The output of terraform plan explicitly marks the argument that forces destruction and recreation with a "# forces replacement" comment (shown in red in a color terminal):

    terraform plan
    
    . . .
    
    Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
    -/+ destroy and then create replacement
    
    Terraform will perform the following actions:
    
      # google_container_node_pool.$GKE_NODE_POOL must be replaced
    -/+ resource "google_container_node_pool" ". . ." {
    
    . . .
    
          ~ initial_node_count  = 3 -> 4 # forces replacement
    
    . . .
    
    Plan: 1 to add, 0 to change, 1 to destroy.
    
    . . .
    

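    Of the arguments changed here, initial_node_count is the only one that forces replacement; min_node_count and max_node_count can be updated in place. Other arguments that were not changed here, such as name or cluster, also force replacement when modified. The initial_node_count argument is also optional.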

    This warning appears in the official documentation for the google_container_node_pool resource, under the initial_node_count argument.
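
    The advice above can be sketched as follows. This is a minimal illustration, not the asker's actual file: the resource label "example" and the name, cluster, and location values are hypothetical placeholders. It keeps initial_node_count at the value the pool was created with, changes only the autoscaling bounds, and adds a lifecycle block (an option the provider documentation describes, as far as I recall) so that later drift in initial_node_count, for example after a manual resize, does not trigger replacement.

    ```hcl
    # Hypothetical label and identifiers; substitute your own values.
    resource "google_container_node_pool" "example" {
      name     = "example-node-pool"
      cluster  = "example-cluster"
      location = "us-central1-a"

      # Only the autoscaling bounds change (3 -> 4); these update in place.
      autoscaling {
        max_node_count = 4
        min_node_count = 4
      }

      # Unchanged from the value used at creation; editing it forces replacement.
      initial_node_count = 3

      # Ignore any difference in initial_node_count between the
      # configuration and the real node pool when planning.
      lifecycle {
        ignore_changes = [initial_node_count]
      }
    }
    ```

    With this in place, terraform plan should report an in-place update rather than "1 to add, 1 to destroy". Because initial_node_count is optional, omitting it from the configuration entirely is another way to avoid the problem if you do not need it.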