python, amazon-web-services, cassandra, amazon-keyspaces

Why is connecting to AWS Keyspaces so slow with Python's cassandra-driver?


I have an API, a Flask application written in Python and deployed on AWS EC2. Some endpoints need to connect to AWS Keyspaces to run a query. But cluster.connect() is too slow: it takes 5 seconds to connect before the query can run.

To work around this, I open a connection when the application starts (that is, on each deployment; I'm using CodePipeline, which deploys when a commit lands on the master branch) and then keep that connection open all the time.

I didn't find anything in the Python Cassandra driver documentation that advises against this. Are there any potential problems with this approach?


Solution

  • This is the recommended approach: open the connection at startup and keep it, with one connection per application. Opening a connection to a Cassandra cluster is an expensive operation because, besides establishing the connection itself, the driver discovers the cluster topology, calculates token ranges, and does a number of other things. For "normal" Cassandra this usually shouldn't take very long (though it is still expensive), and AWS's emulation may add extra latency on top of it.
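
A minimal sketch of the "connect once, reuse everywhere" pattern, under some assumptions: `SessionHolder`, `connect_to_keyspaces`, and the `KEYSPACES_*` environment variable names are hypothetical, and the endpoint, port 9142, certificate file, and credential type are placeholders for whatever your Keyspaces setup actually uses. The TLS settings in particular may need adjusting for your driver version and AWS's current guidance.

```python
import os
import ssl
import threading


class SessionHolder:
    """Creates one session on first use and hands back the same one afterwards."""

    def __init__(self, connect):
        self._connect = connect          # zero-argument callable that returns a session
        self._lock = threading.Lock()
        self._session = None

    def get(self):
        # Double-checked locking so concurrent requests trigger only one connect.
        if self._session is None:
            with self._lock:
                if self._session is None:
                    self._session = self._connect()
        return self._session


def connect_to_keyspaces():
    """Hypothetical factory; every value read from the environment is a placeholder."""
    # Imported here so this module can be loaded without the driver installed.
    from cassandra.auth import PlainTextAuthProvider
    from cassandra.cluster import Cluster

    # Assumption: a CA bundle on disk; hostname verification may need tuning.
    ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ssl_context.load_verify_locations(os.environ["KEYSPACES_CA_FILE"])

    # Assumption: service-specific credentials rather than SigV4 authentication.
    auth_provider = PlainTextAuthProvider(
        username=os.environ["KEYSPACES_USER"],
        password=os.environ["KEYSPACES_PASSWORD"],
    )

    cluster = Cluster(
        [os.environ["KEYSPACES_ENDPOINT"]],  # e.g. cassandra.<region>.amazonaws.com
        port=9142,
        ssl_context=ssl_context,
        auth_provider=auth_provider,
    )
    # This is the slow step: it connects and discovers the cluster topology.
    return cluster.connect()


# One holder per process; constructing it does not connect yet.
session_holder = SessionHolder(connect_to_keyspaces)
```

Calling `session_holder.get()` once during application startup pays the connection cost up front, and each Flask endpoint can then call `session_holder.get().execute(...)` to reuse the same session. If the app runs under several worker processes, each process would hold its own session.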