google-kubernetes-engine google-cloud-sql cloud-sql-proxy gke-networking

Connecting to public Cloud SQL from a private GKE cluster


I'm trying to connect to a Cloud SQL instance with a public IP from a GKE cluster using cloud-sql-proxy. I created my cluster with the following commands:

gcloud services enable compute.googleapis.com
gcloud services enable container.googleapis.com
gcloud container clusters create my-cluster \
  --disk-size=10GB \
  --machine-type=e2-small \
  --node-locations=us-central1-b,us-central1-c,us-central1-f \
  --num-nodes=1 \
  --preemptible \
  --release-channel=regular \
  --workload-pool=my-production.svc.id.goog \
  --zone=us-central1-f \
  --no-enable-master-authorized-networks \
  --enable-ip-alias \
  --enable-private-nodes \
  --master-ipv4-cidr 172.16.0.32/28

I created my Cloud SQL instance with the following command:

gcloud sql instances create my-db \
  --database-version=POSTGRES_12 \
  --region=us-central1 \
  --storage-auto-increase \
  --storage-size=10 \
  --storage-type=SSD \
  --tier=db-f1-micro

I also set up a service account with these commands:

gcloud iam service-accounts create my-service-account
gcloud iam service-accounts add-iam-policy-binding \
  --role=roles/iam.workloadIdentityUser \
  --member="serviceAccount:my-production.svc.id.goog[default/my-service-account]" \
  my-service-account@my-production.iam.gserviceaccount.com
gcloud projects add-iam-policy-binding my-production \
  --member serviceAccount:"my-service-account@my-production.iam.gserviceaccount.com" \
  --role "roles/cloudsql.client"

The sidecar container for cloud-sql-proxy in the pod is set up like this:

      - name: cloud-sql-proxy
        image: gcr.io/cloudsql-docker/gce-proxy:1.20.2
        command:
          - "/cloud_sql_proxy"
          - "-instances=my-production:us-central1:my-db=tcp:5432"
          - "-term_timeout=20s"

Despite all that, when my app tries to connect to the Cloud SQL instance, I can see the following error in cloud-sql-proxy logs:

2021/03/19 21:30:02 couldn't connect to "my-production:us-central1:my-db": dial tcp MY_DB_PUBLIC_IP:3307: connect: connection timed out

I checked and the pod has Internet access (I can access www.google.com) so it should be able to connect to Cloud SQL's public IP. I can use cloud-sql-proxy without problems on my laptop and connect to the instance there. What am I missing? What else can I check?

I found "GKE private cluster and cloud sql proxy connection", but I already have the SQL Admin API enabled. "Connection between Private GKE and Cloud SQL" only covers a GKE cluster that has no Internet access.


Solution

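  • In short:

    • Don't use www.google.com to test whether a node has Internet access on GCP. Traffic to www.google.com appears to be treated as internal on GCP's infrastructure, so it succeeds even when the node has no external connectivity. Fetching a non-Google site such as www.amazon.com fails from a private GKE cluster, which shows that the nodes cannot reach the public Internet and therefore cannot reach the Cloud SQL instance's public IP either.
    • To connect to Cloud SQL from a private GKE cluster, give the Cloud SQL instance a private IP and connect through it (remember to add -ip_address_types=PRIVATE to the cloud-sql-proxy command).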
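
Both points can be sketched in shell. This is a rough outline under several assumptions, not a verified procedure: the pod and container names passed to check_egress are placeholders, the app container is assumed to ship curl, the reserved range name google-managed-services-NETWORK is made up for illustration, and private services access is assumed not to be configured on the VPC yet. Availability of the --network flag on gcloud sql instances patch may depend on your gcloud version.

```shell
# Probe outbound Internet access from inside a pod using a non-Google host.
# Returns 0 if the host answers within 5 seconds, non-zero otherwise.
# Assumes the target container has curl installed.
check_egress() {
  local pod="$1" container="$2" url="${3:-https://www.amazon.com}"
  kubectl exec "$pod" -c "$container" -- \
    curl --silent --show-error --output /dev/null --max-time 5 "$url"
}

# Give an existing Cloud SQL instance a private IP on the cluster's VPC.
# Arguments: project ID, VPC network name, Cloud SQL instance name.
enable_private_ip() {
  local project="$1" network="$2" instance="$3"
  # Hypothetical name for the range reserved for Google-managed services.
  local range="google-managed-services-$network"

  gcloud services enable servicenetworking.googleapis.com --project="$project" &&
  # Reserve an internal address range for the service networking peering.
  gcloud compute addresses create "$range" \
    --project="$project" --global --purpose=VPC_PEERING \
    --prefix-length=16 --network="$network" &&
  # Peer the VPC with the service producer network.
  gcloud services vpc-peerings connect \
    --project="$project" --service=servicenetworking.googleapis.com \
    --ranges="$range" --network="$network" &&
  # Attach the instance to the VPC so it receives a private IP.
  gcloud sql instances patch "$instance" --project="$project" \
    --network="projects/$project/global/networks/$network"
}
```

With the names from the question, the calls would look like check_egress my-pod my-app (pod and container names are placeholders) and enable_private_ip my-production default my-db, assuming the cluster uses the default network. After that, adding "-ip_address_types=PRIVATE" to the sidecar's command list makes cloud-sql-proxy dial the private address.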