How do database connections work in Google App Engine python + Google Cloud SQL?


I have a working Python 3 app on Google App Engine (flexible) connecting to Postgres on Google Cloud SQL. I got it working by following the docs, which at some point have you connect via psycopg2 using a database specifier like this:

postgresql://postgres:password@/dbname?host=/cloudsql/project:us-central1:dbhost

I'm trying to understand how the hostname /cloudsql/project:us-central1:dbhost works. Google calls this an "instance connection string", and it sure looks like it's playing the role of a regular hostname. But because it contains / and : characters, it's not a valid name for a DNS resolver.

Is Google's flexible Python environment modified somehow to support special hostnames like this? It looks like stock Python 3 and psycopg2, but I can't tell for sure. If it is modified, are those modifications documented anywhere? The docs for the Python runtime don't have anything about this.


Solution

  • It turns out that host=/cloudsql/project:us-central1:dbhost names a directory in the filesystem, not a network host. Inside that directory is a file named .s.PGSQL.5432, which is a Unix domain socket. An instance of the Cloud SQL Proxy listens on that socket and forwards database requests over TCP to the actual database server. Although the parameter is called host=, libpq (the Postgres client library that psycopg2 is built on) treats a host value that looks like an absolute path as the directory containing the Unix socket, so no DNS lookup is involved.

    Many thanks to Vadim for answering my question quickly in a comment; I'm just writing up a more complete answer here for future readers.
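
    To make the mechanism concrete, here is a minimal sketch in plain Python (standard library only, no psycopg2 or Cloud SQL needed). The helper names `socket_path_from_dsn` and `demo_round_trip` are made up for illustration, and the tiny server thread is only a stand-in for the Cloud SQL Proxy; it does not speak the Postgres protocol. The sketch shows two things: the `host` query parameter resolves to a socket file path of the form `<dir>/.s.PGSQL.<port>`, and a client can talk to whatever process is listening on that file.

```python
import os
import socket
import tempfile
import threading
from urllib.parse import parse_qs, urlsplit


def socket_path_from_dsn(dsn: str) -> str:
    """Return the Unix socket file that a libpq-style client would open.

    Illustrative only: this mimics the naming rule <dir>/.s.PGSQL.<port>
    for a host value that is an absolute path. It is not libpq's parser.
    """
    query = parse_qs(urlsplit(dsn).query)
    host_dir = query["host"][0]                 # e.g. /cloudsql/project:region:name
    port = int(query.get("port", ["5432"])[0])  # Postgres default port
    if not host_dir.startswith("/"):
        raise ValueError("host is not a directory path; TCP would be used")
    return os.path.join(host_dir, f".s.PGSQL.{port}")


def demo_round_trip() -> bytes:
    """Bind a stand-in 'proxy' on a socket file and read from it as a client."""
    with tempfile.TemporaryDirectory() as sock_dir:
        dsn = f"postgresql://postgres:password@/dbname?host={sock_dir}"
        path = socket_path_from_dsn(dsn)

        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)   # creates the .s.PGSQL.5432 file in sock_dir
        server.listen(1)

        def serve() -> None:
            conn, _ = server.accept()
            with conn:
                conn.sendall(b"hello from the proxy stand-in")

        worker = threading.Thread(target=serve)
        worker.start()

        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.connect(path)  # a filesystem path, so no DNS lookup happens
        chunks = []
        while True:
            chunk = client.recv(1024)
            if not chunk:     # server closed the connection
                break
            chunks.append(chunk)
        client.close()
        worker.join()
        server.close()
        return b"".join(chunks)


if __name__ == "__main__":
    print(socket_path_from_dsn(
        "postgresql://postgres:password@/dbname"
        "?host=/cloudsql/project:us-central1:dbhost"
    ))
    print(demo_round_trip())
```

    On App Engine the directory under /cloudsql is already populated by the proxy, so with psycopg2 you only pass the directory as the host, for example psycopg2.connect(dbname="dbname", user="postgres", password="password", host="/cloudsql/project:us-central1:dbhost"), and libpq finds the socket file itself.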