Tags: google-cloud-platform, airflow, google-cloud-composer, cloud-sql-proxy

Cloud Composer issue loading DAG connecting to Cloud SQL


I've made a DAG which connects to Cloud SQL (MySQL) through a Cloud SQL Proxy installed on a GCE instance. It reads a list of tables and generates a number of tasks based on them. I've run this DAG successfully in Airflow locally on my machine, but once I deploy it to a Cloud Composer instance, the DAG doesn't load properly into the Airflow web UI. The only options available for the DAG are refresh and delete; none of the others appear.
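To illustrate the pattern described above (one generated task per table), here is a minimal sketch of the task-naming step; the helper name and the `load_` prefix are hypothetical, not from the actual DAG:

```python
def task_ids_for(tables, prefix="load"):
    """Produce one Airflow task id per table name, as a dynamically
    generated DAG typically does in a loop at parse time.
    The prefix and naming scheme here are illustrative only."""
    return [f"{prefix}_{table}" for table in tables]
```

In a real DAG, each id returned here would become the `task_id` of an operator created inside a loop over the table list.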

The scheduler finds the DAG, and I can see in the logs that a connection is made to Cloud SQL and the tables are retrieved, but for some reason the Airflow web UI doesn't like it. There are no errors in the log.

I am aware of the architecture of Composer as depicted here: https://cloud.google.com/composer/docs/concepts/overview, and I'm wondering if it has something to do with the admin web UI running in a tenant project. However, I briefly opened the firewall to all connections from everywhere to rule out a firewall issue, with no luck. So I'm thinking it might be a routing issue.

The code which connects to the Cloud SQL Proxy looks like this:

import pymysql

# "connection" is an open PyMySQL connection to the Cloud SQL Proxy
with connection.cursor(pymysql.cursors.DictCursor) as cursor:
    sql = "select <redacted>"
    cursor.execute(sql)
    result = cursor.fetchall()
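For context, `connection` above is assumed to be an already-open PyMySQL connection to the proxy. The same fetch logic can be sketched as a self-contained helper (the function name and default are illustrative, not from the original DAG):

```python
def fetch_all(connection, sql, cursor_factory=None):
    """Run `sql` on an open DB-API-style connection and return every row.

    Pass pymysql.cursors.DictCursor as `cursor_factory` (as in the
    question) to get each row back as a dict keyed by column name
    rather than a plain tuple.
    """
    with connection.cursor(cursor_factory) as cursor:
        cursor.execute(sql)
        return cursor.fetchall()
```

Keeping the query behind a function like this also makes it easy to call from inside a task, rather than at module top level during DAG parsing.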

I create the cluster like this:

gcloud composer environments create comp-etl-runner \
--disk-size="30GB" --location="europe-west1" --zone="europe-west1-b" \
--machine-type="n1-standard-1" --node-count=3 \
--service-account="<redacted>" \
--python-version=3 --image-version="composer-1.7.2-airflow-1.10.2" --network="dev-network-1" \
--subnetwork="dev-subnet-3"

I've tried enabling IP aliasing and specifying the IP ranges like this:

gcloud beta composer environments create comp-etl-runner \
--disk-size="30GB" --location="europe-west1" --zone="europe-west1-b" \
--machine-type="n1-standard-1" --node-count=3 \
--service-account="<redacted>" \
--python-version=3 --image-version="composer-1.7.2-airflow-1.10.2" --network="dev-network-1" \
--subnetwork="dev-subnet-3" \
--enable-ip-alias \
--cluster-ipv4-cidr="10.207.0.0/19" \
--services-ipv4-cidr="10.207.32.0/19"

But that didn't make a difference.

I also tried adding these two parameters:

--enable-private-environment \
--master-ipv4-cidr="10.207.64.0/19"

but then the environment creation just fails.

I'm tearing my hair out as my DAG is working perfectly in Airflow on my machine, but not in Cloud Composer. So any ideas would be greatly appreciated.


Solution

  • I believe that the most suitable workaround is to deploy a self-managed web server (as described here) in the same GKE cluster, so that it can go through the Cloud SQL Proxy.

    Another option is to assign a public IP to your Cloud SQL instance and whitelist all addresses, making it accessible from the public internet. I'm not sure whether you can afford that in your use case, though. If you choose this option, you should configure the instance to require SSL to maximize security.
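If you go the public-IP route, PyMySQL can enforce TLS through the `ssl` argument to `pymysql.connect()`. A hedged sketch of building those arguments follows; the host, database name, and certificate paths are placeholders, not values from the question:

```python
def ssl_connect_kwargs(host, user, password, db,
                       ca="/path/server-ca.pem",
                       cert="/path/client-cert.pem",
                       key="/path/client-key.pem"):
    """Build keyword arguments for pymysql.connect() against a
    public-IP Cloud SQL instance that requires client certificates.
    All file paths here are placeholders for illustration.
    """
    return {
        "host": host,
        "user": user,
        "password": password,
        "db": db,
        # PyMySQL accepts an `ssl` dict with ca/cert/key entries,
        # which it passes through to the underlying TLS setup.
        "ssl": {"ca": ca, "cert": cert, "key": key},
    }
```

You would then call something like `pymysql.connect(**ssl_connect_kwargs(...))`, with the certificates downloaded from the Cloud SQL instance's Connections page.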