
~60 requests/hour to `compute.v1.BackendServicesService.Get` only returning 404s


2023-05-16: Update

Curiously, the project logs don't include any references to these method calls. I may be sinking too many logs to /dev/null, but there are references to other compute.googleapis.com/v1 methods, just none to the backendServices.get method.

PROJECT=".."
FILTER='
log_id("cloudaudit.googleapis.com/activity")
protoPayload.serviceName="compute.googleapis.com"
protoPayload.methodName=~"v1.compute"'

gcloud logging read "${FILTER}" \
--format="value(protoPayload.methodName)" \
--project=${PROJECT} \
| sort \
| uniq

Yields:

v1.compute.addresses.insert
v1.compute.firewalls.insert
v1.compute.forwardingRules.insert
v1.compute.instanceGroups.addInstances
v1.compute.instanceGroups.insert
v1.compute.instances.insert
v1.compute.projects.setCommonInstanceMetadata
v1.compute.subnetworks.patch
v1.compute.subnetworks.setPrivateIpGoogleAccess
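One possible explanation, which I haven't confirmed: backendServices.get is a read, so it would be recorded as an ADMIN_READ entry in the Data Access audit log (cloudaudit.googleapis.com/data_access) rather than the Activity log, and Data Access audit logs are disabled by default. A sketch of the corresponding query (guarded so it's a no-op where gcloud isn't installed):

```shell
# Sketch: query the Data Access log rather than the Activity log.
# Data Access audit logs are disabled by default, so this may return
# nothing until they are enabled for compute.googleapis.com.
FILTER='
log_id("cloudaudit.googleapis.com/data_access")
protoPayload.serviceName="compute.googleapis.com"
protoPayload.methodName=~"backendServices.get"'

if command -v gcloud >/dev/null; then
  gcloud logging read "${FILTER}" \
  --format="value(protoPayload.methodName)" \
  --project=${PROJECT}
fi
```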

2023-05-15: Update

Thanks to @john-hanley, who clued me in on a way to identify the MYSTERY Service Account. Grep'ing the audit logs, I found a few entries of the form:

logName: projects/ackal-230515/logs/cloudaudit.googleapis.com%2Factivity
protoPayload:
  ...
  request:
    '@type': type.googleapis.com/google.iam.v1.SetIamPolicyRequest
    policy:
      bindings:
      ...
      - members:
        - serviceAccountId:112852397007451968863
        role: roles/container.serviceAgent
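For reference, a sketch of the kind of search that surfaces those entries (the exact filter I used may have differed slightly):

```shell
# Sketch: pull SetIamPolicy audit entries and grep for serviceAccountId
# references; guarded in case gcloud is unavailable.
FILTER='
log_id("cloudaudit.googleapis.com/activity")
protoPayload.methodName="SetIamPolicy"'

if command -v gcloud >/dev/null; then
  gcloud logging read "${FILTER}" \
  --project=${PROJECT} \
  --format=yaml \
  | grep --before-context=2 "serviceAccountId"
fi
```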

And there's only one member in the Project's Policy binding that uses this role:

gcloud projects get-iam-policy ${PROJECT} \
--flatten="bindings[].members" \
--filter="bindings.role=\"roles/container.serviceAgent\""

Yields:

bindings:
  members: serviceAccount:service-{number}@container-engine-robot.iam.gserviceaccount.com
  role: roles/container.serviceAgent
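The {number} in that service-agent email is the project *number* (not the project ID), which can be looked up directly:

```shell
# The Google-managed service-agent email embeds the project number
ROBOT_SUFFIX="container-engine-robot.iam.gserviceaccount.com"

if command -v gcloud >/dev/null; then
  NUMBER=$(gcloud projects describe ${PROJECT} --format="value(projectNumber)")
  echo "service-${NUMBER}@${ROBOT_SUFFIX}"
fi
```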

So, I know which Service Account is being used and that it's related to Kubernetes Engine, but I don't understand why it's making these unnecessary method calls.

Original Question

I have been reviewing serviceruntime.googleapis.com/api/request_count for consumed_api in an effort to audit a Project's Service Accounts.

Recently, I swapped a Kubernetes Engine (GKE) cluster's nodes' Service Account from the Default Compute Engine account to a User-managed account with role roles/container.nodeServiceAccount and I am trying to ensure that there are no failed method calls by this Service Account.
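For context, the swap looked something like this (a sketch; the account and node-pool names are hypothetical placeholders, and location flags are omitted):

```shell
# Hypothetical names for illustration only
ACCOUNT="gke-nodes@${PROJECT}.iam.gserviceaccount.com"
CLUSTER="my-cluster"

if command -v gcloud >/dev/null; then
  # Grant the minimal node role to the user-managed account
  gcloud projects add-iam-policy-binding ${PROJECT} \
  --member="serviceAccount:${ACCOUNT}" \
  --role="roles/container.nodeServiceAccount"

  # Create a node pool whose nodes run as that account
  gcloud container node-pools create user-sa-pool \
  --cluster=${CLUSTER} \
  --service-account=${ACCOUNT}
fi
```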

Among the non-200 response code results are ~60/hour calls to compute.v1.BackendServicesService.Get. There are only 404s against this method.

Questions:

  1. How can I determine the Unique ID for Google-managed Service Accounts?
  2. What could be using this Service Account to make these calls?
  3. Why is Kubernetes Engine using this Service Account to make thousands of unnecessary method calls?

I'm using a myriad of Google Cloud services (Cloud Run, Kubernetes Engine, etc.) but no Load Balancers and the project does not contain any Backend services:

gcloud compute backend-services list \
--project=${PROJECT}

Listed 0 items.

The calls are all made by a Service Account that I'm unable to identify: 100678112478450061433.

It's not the ID of one of the Project's Service Accounts:

PROJECT="..." # Project ID
MYSTERY="100678112478450061433"

gcloud iam service-accounts list \
--project=${PROJECT} \
--format="value(uniqueId)" \
| grep ${MYSTERY}
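Equivalently (a sketch), gcloud's own --filter can do the matching instead of grep:

```shell
MYSTERY="100678112478450061433"

# Sketch: have gcloud filter on uniqueId directly; prints nothing
# when no project Service Account has that ID
if command -v gcloud >/dev/null; then
  gcloud iam service-accounts list \
  --project=${PROJECT} \
  --filter="uniqueId=${MYSTERY}" \
  --format="value(email)"
fi
```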

And it's not the ID of a describe-able Service Account in the Project's IAM binding:

PROJECT="..." # Project ID
MYSTERY="100678112478450061433"

EMAILS=$(\
  gcloud projects get-iam-policy ${PROJECT} \
  --flatten="bindings[].members" \
  --filter="bindings.members~\"serviceAccount:*\"" \
  --format="value(bindings.members.split(sep=\":\").slice(1:))" \
  | sort | uniq)

for EMAIL in ${EMAILS}
do
  printf "%s: " ${EMAIL}
  ID=$(\
    gcloud iam service-accounts describe ${EMAIL} \
    --format="value(uniqueId)" \
    2>/dev/null)
  if [ -z "${ID}" ]
  then
    echo "Inaccessible"
    continue
  fi
  if [ "${ID}" = "${MYSTERY}" ]
  then
    echo "Found!"
    break
  else
    echo "No match"
  fi
done

I suspect it is one of the Google-managed Service Accounts, but I don't know how to find the Unique ID for these:

{NUMBER}@cloudbuild.gserviceaccount.com
{NUMBER}@cloudservices.gserviceaccount.com
service-{NUMBER}@compute-system.iam.gserviceaccount.com
service-{NUMBER}@container-engine-robot.iam.gserviceaccount.com
service-{NUMBER}@containerregistry.iam.gserviceaccount.com
service-{NUMBER}@firebase-rules.iam.gserviceaccount.com
service-{NUMBER}@gcf-admin-robot.iam.gserviceaccount.com
service-{NUMBER}@gcp-sa-artifactregistry.iam.gserviceaccount.com
service-{NUMBER}@gcp-sa-cloudbuild.iam.gserviceaccount.com
service-{NUMBER}@gcp-sa-cloudscheduler.iam.gserviceaccount.com
service-{NUMBER}@gcp-sa-firestore.iam.gserviceaccount.com
service-{NUMBER}@gcp-sa-pubsub.iam.gserviceaccount.com
service-{NUMBER}@serverless-robot-prod.iam.gserviceaccount.com
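A sketch that walks some of the "service-" patterns above and attempts to resolve each uniqueId; note (an assumption on my part) that describe may be denied for agents that live in Google-owned projects, so many of these may come back inaccessible:

```shell
# Expand a "service-{NUMBER}@{domain}" service-agent pattern
candidate() {
  echo "service-${1}@${2}"
}

if command -v gcloud >/dev/null; then
  NUMBER=$(gcloud projects describe ${PROJECT} --format="value(projectNumber)")
  for DOMAIN in \
    compute-system.iam.gserviceaccount.com \
    container-engine-robot.iam.gserviceaccount.com \
    containerregistry.iam.gserviceaccount.com
  do
    EMAIL=$(candidate "${NUMBER}" "${DOMAIN}")
    ID=$(gcloud iam service-accounts describe "${EMAIL}" \
      --format="value(uniqueId)" \
      2>/dev/null)
    printf "%s: %s\n" "${EMAIL}" "${ID:-inaccessible}"
  done
fi
```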

If I can identify the Service Account, I'm closer to understanding the cause.


Solution

  • The container-engine-robot service account is used to manage the lifecycle of GCE resources (e.g. nodes, disks, load balancers) used by GKE.

    I believe the backendServices.get call is coming from the Ingress controller which is periodically checking to make sure it has the proper IAM permissions to manage load balancer resources.

    Looking at https://cloud.google.com/kubernetes-engine/docs/concepts/ingress#implementation_details :

    The Ingress controller performs periodic checks of service account permissions by fetching a test resource from your Google Cloud project. You will see this as a GET of the (non-existent) global BackendService with the name k8s-ingress-svc-acct-permission-check-probe

    I took a look at the implementation and apparently it expects to receive a 404 (as the resource should not exist) and considers that a successful check.
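Under that reading, the probe can be reproduced by hand (the resource name comes from the GKE documentation quoted above); a not-found error here is the expected, "healthy" outcome:

```shell
# Probe name taken from the GKE Ingress docs quoted above
PROBE="k8s-ingress-svc-acct-permission-check-probe"

if command -v gcloud >/dev/null; then
  # A 404/not-found error is expected: the probe resource intentionally
  # does not exist, and the Ingress controller treats that as success
  gcloud compute backend-services describe "${PROBE}" \
  --global \
  --project=${PROJECT} \
  || echo "not found (expected)"
fi
```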