Tags: kubernetes, terraform, google-cloud-storage, google-kubernetes-engine, google-cloud-iam

How to connect GKE to Google Object Storage


I have a node pool in GKE I have configured using Terraform:

resource "google_service_account" "kubernetes" {
    account_id = "kubernetes"
}

resource "google_container_node_pool" "general" {
    name = "general"
    cluster = google_container_cluster.primary.id
    node_count = 1

    management {
        auto_repair = true
        auto_upgrade = true

    }

    node_config {
        preemptible = false
        machine_type = "c3d-standard-8"

        labels = {
            role = "general"
        }

        service_account = google_service_account.kubernetes.email
        oauth_scopes = [
            "https://www.googleapis.com/auth/cloud-platform"
        ]

        workload_metadata_config {
            mode = "GKE_METADATA"
        }
    }
}

The cluster config is as follows:

resource "google_container_cluster" "primary" {
    name = "primary"
    location = "us-central1-b"
    remove_default_node_pool = true
    initial_node_count = 1
    network = google_compute_network.main.self_link
    subnetwork = google_compute_subnetwork.private.self_link
    logging_service = "logging.googleapis.com/kubernetes"
    monitoring_service = "monitoring.googleapis.com/kubernetes"
    networking_mode = "VPC_NATIVE"

    addons_config {
        http_load_balancing {
            disabled = true
        }
        horizontal_pod_autoscaling {
            disabled = false
        }
    }

    release_channel {
        channel = "REGULAR"
    }

    workload_identity_config {
        workload_pool = "XXXXXXXXXXXX.svc.id.goog"
    }

    ip_allocation_policy {
        cluster_secondary_range_name = "k8s-pod-range"
        services_secondary_range_name = "k8s-service-range"
    }

    private_cluster_config {
        enable_private_nodes = true
        enable_private_endpoint = false
        master_ipv4_cidr_block = "172.16.0.0/28"
    }
}

I've tried to configure object storage with the following config:

resource "google_project_service" "storage_api" {
  service = "storage-api.googleapis.com"

  disable_on_destroy = false
}

resource "google_storage_bucket" "artifacts" {
  name          = "xxxxxxx-artifacts-bucket"
  location      = "us-central1"
  force_destroy = true
  public_access_prevention = "enforced"  

  labels = {
    environment = "dev"
  }

  uniform_bucket_level_access = true
}

resource "google_storage_bucket_iam_binding" "artifacts_binding" {
  bucket = google_storage_bucket.artifacts.name
  role   = "roles/storage.objectAdmin"

  members = [
    "serviceAccount:${google_service_account.kubernetes.email}",
  ]
}

I am getting the following error when running my application on Kubernetes:

"error": {
    "code": 403,
    "message": "Caller does not have storage.objects.create access to the Google Cloud Storage object. Permission 'storage.objects.create' denied on resource (or it may not exist).",
    "errors": [
      {
        "message": "Caller does not have storage.objects.create access to the Google Cloud Storage object. Permission 'storage.objects.create' denied on resource (or it may not exist).",
        "domain": "global",
        "reason": "forbidden"
      }
    ]
  }

Is there something I am missing?

I have seen some people talk about using workload identity federation but I'm not sure if this is necessary as I'm using this service account already to read the artifact registry without issue.


Solution

  • First, a disclaimer: I've always defined GKE clusters using the GKE Terraform module, so I'm not completely familiar with the manual configuration you are doing. There could be some important settings missing that I'm not aware of, such as enabling the metadata service, which was discussed in the comments. You would avoid these issues by using the GKE Terraform module.

    With that said, there are basically 3 ways to authenticate a workload in GKE:

    1. Export a service account key and mount it as a Kubernetes secret accessible from your app's pod.
    2. Use workload identity to automatically connect a service account to your app's pod.
    3. Use the default service account of your node pool, which is accessible via the metadata service to every application running on your cluster.

    1. Service account key in a Kubernetes secret

    This is the most naive way to handle authentication, but it's also guaranteed to work without any issues. It's the approach I would recommend if you can't use workload identity.


    # define a service account dedicated to your app
    resource "google_service_account" "demo_app" {
      # account_id may only contain lowercase letters, digits and hyphens
      account_id = "demo-app-service-account"
    }
    # add all the permissions you need to the service account.
    # this is an example permission
    resource "google_storage_bucket_iam_binding" "demo_app" {
      bucket = google_storage_bucket.artifacts.name
      role   = "roles/storage.objectAdmin"
    
      members = [
        "serviceAccount:${google_service_account.demo_app.email}",
      ]
    }
    # generate a service account key
    # this is basically a large json string
    resource "google_service_account_key" "demo_app" {
      service_account_id = google_service_account.demo_app.name
      public_key_type    = "TYPE_X509_PEM_FILE"
      private_key_type = "TYPE_GOOGLE_CREDENTIALS_FILE"
    }
    
    # define a kubernetes secret containing the json service account key
    # Don't use this part if you don't want to manage kubernetes resources via terraform
    resource "kubernetes_secret" "demo_app" {
      metadata {
        name = "demo-app-service-account"
      }
      type = "Opaque"
      data = {
        # private_key is base64-encoded; the kubernetes provider encodes
        # `data` values itself, so decode first to avoid double encoding
        sa_json = base64decode(google_service_account_key.demo_app.private_key)
      }
    }
    
    

    With this, you can now configure your workload to use the service account key, by mounting the kubernetes secret as instructed here.
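    As a rough sketch of that mounting step in Terraform (the deployment name, image, and mount path below are placeholders, not values from the question):

    ```hcl
    # Hypothetical deployment wiring the secret into the pod.
    # Google client libraries read GOOGLE_APPLICATION_CREDENTIALS to
    # locate a service account key file.
    resource "kubernetes_deployment" "demo_app" {
      metadata {
        name = "demo-app"
      }
      spec {
        selector {
          match_labels = {
            app = "demo-app"
          }
        }
        template {
          metadata {
            labels = {
              app = "demo-app"
            }
          }
          spec {
            container {
              name  = "demo-app"
              image = "demo-app:latest" # placeholder image
              env {
                name  = "GOOGLE_APPLICATION_CREDENTIALS"
                value = "/var/secrets/google/sa_json"
              }
              volume_mount {
                name       = "sa-key"
                mount_path = "/var/secrets/google"
                read_only  = true
              }
            }
            volume {
              name = "sa-key"
              secret {
                secret_name = "demo-app-service-account"
              }
            }
          }
        }
      }
    }
    ```

    The secret key `sa_json` becomes a file of the same name under the mount path, which is why `GOOGLE_APPLICATION_CREDENTIALS` points at `/var/secrets/google/sa_json`.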

    This solution has some security implications:

    • Since we chose to manage the service account key via Terraform, it is now stored in the Terraform state. If someone gains access to the state, it will be difficult to audit where, how, or when the key was stolen.
    • If your application is compromised, an attacker gains access to the service account key, which never expires. Rotating the key is a manual process, and there is no way to automate it with this setup. This issue in particular is what workload identity is designed to solve.
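    One partial mitigation (a sketch, assuming you add the hashicorp/time provider) is to tie the key to a `time_rotating` resource through `keepers`, so Terraform recreates it on a schedule instead of never:

    ```hcl
    # When this resource passes its rotation window it is replaced,
    # which changes the keepers map and forces a new key
    resource "time_rotating" "demo_app_key" {
      rotation_days = 30
    }

    resource "google_service_account_key" "demo_app" {
      service_account_id = google_service_account.demo_app.name

      keepers = {
        rotation_time = time_rotating.demo_app_key.rotation_rfc3339
      }
    }
    ```

    This still leaves the key in the state file, and the rotation only happens when you run `terraform apply`, so workload identity remains the better long-term answer.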

    2. Workload identity

    This is the recommended way to manage authentication on GKE, and it's the most secure. You will see that it's not complicated, but you might run into issues because it requires some configuration in the cluster. This is the setting I'm aware of; there may be others missing:

    resource "google_container_cluster" "cluster" {
      # ...
      workload_identity_config {
        # replace PROJECT with your GCP project ID
        # (identity_namespace is the deprecated, pre-v4 name of this argument)
        workload_pool = "PROJECT.svc.id.goog"
      }
    }
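    In addition to the cluster-level setting, each node pool must expose the GKE metadata server to its pods; your node pool config already does this:

    ```hcl
    resource "google_container_node_pool" "general" {
      # ...

      node_config {
        # Required for workload identity: pods talk to the GKE metadata
        # server instead of the underlying VM's metadata endpoint
        workload_metadata_config {
          mode = "GKE_METADATA"
        }
      }
    }
    ```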
    

    This is how you connect a service account to your app using workload identity:

    You start by defining a service account dedicated to your app, with all the permissions your app needs.

    # define a service account dedicated to your app
    resource "google_service_account" "demo_app" {
      # account_id may only contain lowercase letters, digits and hyphens
      account_id = "demo-app-service-account"
    }
    # add all the permissions you need to the service account.
    # This is an example permission
    resource "google_storage_bucket_iam_binding" "demo_app" {
      bucket = google_storage_bucket.artifacts.name
      role   = "roles/storage.objectAdmin"
    
      members = [
        "serviceAccount:${google_service_account.demo_app.email}",
      ]
    }
    

    Then, you need to define a Kubernetes service account with a special annotation that binds it to your Google service account:

    resource "kubernetes_service_account" "demo_app" {
      metadata {
        name      = "demo-app-k8s-service-account"
        namespace = "your-app-namespace"
        # annotations is a map argument, not a nested block
        annotations = {
          # This annotation tells workload identity that this k8s service
          # account is associated with the following IAM service account
          "iam.gke.io/gcp-service-account" = google_service_account.demo_app.email
        }
      }
      automount_service_account_token = false
    }
    
    # Allow the Kubernetes service account to impersonate the IAM service account
    resource "google_service_account_iam_binding" "demo_app" {
      service_account_id = google_service_account.demo_app.name
      role               = "roles/iam.workloadIdentityUser"
    
      members = [
        "serviceAccount:YOUR_PROJECT_ID.svc.id.goog[your-app-namespace/demo-app-k8s-service-account]",
      ]
    }
    
    

    Now all you need to do is set the Kubernetes service account of your workload to the name of the Kubernetes service account you just created. If you manage resources with Terraform, it looks like this:

    resource "kubernetes_deployment" "demo_app" {
      spec {
        template {
          spec {
            service_account_name = "demo-app-k8s-service-account"

            # ... snip: containers, volumes, etc.
          }
        }
      }
    }

    And that's it. Google Cloud client libraries will automatically detect that they are running on a GKE cluster and will know how to retrieve the credentials. You don't have to provide any environment variable or JSON file of any sort.

    You can check that everything is working by opening a shell in the pod of your application, and running this command: curl -H "Metadata-Flavor: Google" http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/email

    You should receive back the email of the Google service account you associated with the workload: [the-name-of-your-serviceaccount]@[your-project-id].iam.gserviceaccount.com.

    169.254.169.254 is the IP of the metadata service, a server that is only reachable locally, which applications running on your cluster can query to retrieve different kinds of metadata. Google Cloud client libraries that need to authenticate will always try to retrieve credentials from there, and if you set up workload identity correctly they will receive a short-lived token with the permissions you configured.


    3. Default service account of the node pool

    This is the approach you are using in the code you provided.

    GKE nodes need a service account to perform non-workload operations such as authenticating to the container registry, or accessing the logging api.

    This service account should not be exposed to the workloads. Instead, you should use a dedicated service account with custom permissions for every workload that needs to access Google Cloud APIs, as I showed in examples 1 and 2.

    The main reason is that this violates the principle of least privilege: all the pods running on the node pool will share the same privileges and credentials, even though only one pod really requires them.

    With that said, the service account should be available to the workloads in your node pool via the metadata service. You can debug it the same way you would debug workload identity access:

    from inside the pod of your app, run: curl -H "Metadata-Flavor: Google" http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/email

    You should receive back the email of the default node pool service account: [the-name-of-your-serviceaccount]@[your-project-id].iam.gserviceaccount.com.

    To view the OAuth scopes granted to the account, run: curl -s -H "Metadata-Flavor: Google" http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/scopes