kubernetes, google-cloud-platform, google-kubernetes-engine, gke-networking

How can I disable SSH on nodes in a GKE node pool?


I am running a regional GKE Kubernetes cluster in us-central1-b, us-central1-c, and us-central1-f, on version 1.21.14-gke.700. I am adding a confidential node pool to the cluster with this command:

gcloud container node-pools create card-decrpyt-confidential-pool-1 \
--cluster=svcs-dev-1 \
--disk-size=100GB \
--disk-type=pd-standard \
--enable-autorepair \
--enable-autoupgrade \
--enable-gvnic \
--image-type=COS_CONTAINERD \
--machine-type="n2d-standard-2" \
--max-pods-per-node=8  \
--max-surge-upgrade=1 \
--max-unavailable-upgrade=1 \
--min-nodes=4 \
--node-locations=us-central1-b,us-central1-c,us-central1-f \
--node-taints=dedicatednode=card-decrypt:NoSchedule \
--node-version=1.21.14-gke.700 \
--num-nodes=4 \
--region=us-central1 \
--sandbox="type=gvisor" \
--scopes=https://www.googleapis.com/auth/cloud-platform \
--service-account="card-decrpyt-confidential@corp-dev-project.iam.gserviceaccount.com" \
--shielded-integrity-monitoring \
--shielded-secure-boot \
--tags=testingdonotuse \
--workload-metadata=GKE_METADATA \
--enable-confidential-nodes

This creates the node pool, but there is one problem: I can still SSH to the instances it creates. This is unacceptable for my use case, as these node pools need to be as secure as possible. I then created a new instance template with SSH turned off, based on the one that was created for my node pool:

gcloud compute instance-templates create card-decrypt-instance-template \
--project=corp-dev-project \
--machine-type=n2d-standard-2 \
--network-interface=aliases=gke-svcs-dev-1-pods-10a0a3cd:/28,nic-type=GVNIC,subnet=corp-dev-project-private-subnet,no-address \
--metadata=block-project-ssh-keys=true,enable-oslogin=true \
--maintenance-policy=TERMINATE \
--provisioning-model=STANDARD \
--service-account=card-decrpyt-confidential@corp-dev-project.iam.gserviceaccount.com \
--scopes=https://www.googleapis.com/auth/cloud-platform \
--region=us-central1 \
--min-cpu-platform=AMD\ Milan \
--tags=testingdonotuse,gke-svcs-dev-1-10a0a3cd-node \
--create-disk=auto-delete=yes,boot=yes,device-name=card-decrpy-instance-template,image=projects/confidential-vm-images/global/images/cos-89-16108-766-5,mode=rw,size=100,type=pd-standard \
--shielded-secure-boot \
--shielded-vtpm \
--shielded-integrity-monitoring \
--labels=component=gke,goog-gke-node=,team=platform \
--reservation-affinity=any

When I switch the nodes in the node pool to the new instance template, the new instances come online but do not join the node pool. The cluster keeps trying to repair itself, and I can't change any settings until I delete all the nodes in the pool. I don't receive any errors.

What do I need to do to disable SSH access to the node pool nodes, either with the original node pool I created or with the new instance template? I have tried a number of different configurations with a new node pool and the cluster, including different tags, network configs, and images. None of these have worked.

Other info: The cluster was not originally a confidential cluster. The confidential nodes are the first of their kind added to the cluster.


Solution

  • I needed to pass the metadata flag when creating the node pool: --metadata=block-project-ssh-keys=TRUE

    This blocked SSH. However, enable-os-login=false won't work, because that key is reserved for use by Kubernetes Engine.
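
For reference, here is a minimal sketch of how the node pool command from the question might look with the metadata flag from the solution added. The pool, cluster, and region names are copied from the question, and only a subset of the original flags is repeated, so the rest would need to be added back for a real run. The DRY_RUN switch is a hypothetical convenience for this sketch, not a gcloud feature. By default the script only prints the command it would run.

```shell
#!/usr/bin/env bash
# Sketch only: rebuilds the question's node pool command with the
# metadata flag from the solution. Flags unrelated to SSH (disk, surge,
# taints, sandbox, and so on) are left out here for brevity.
set -euo pipefail

pool_args=(
  card-decrpyt-confidential-pool-1
  --cluster=svcs-dev-1
  --region=us-central1
  --machine-type=n2d-standard-2
  --image-type=COS_CONTAINERD
  --enable-confidential-nodes
  # The fix from the solution: ignore project-wide SSH keys on these nodes.
  --metadata=block-project-ssh-keys=TRUE
)

# DRY_RUN is a hypothetical switch for this sketch; it defaults to 1,
# which prints the command instead of calling gcloud.
if [[ "${DRY_RUN:-1}" == "1" ]]; then
  printf 'gcloud container node-pools create'
  printf ' %s' "${pool_args[@]}"
  printf '\n'
else
  gcloud container node-pools create "${pool_args[@]}"
fi
```

Running it with DRY_RUN=0 would execute the command against the cluster. Note that block-project-ssh-keys only blocks project-wide keys, so keys set in a node VM's own instance metadata are not affected by it.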