mysql database looker

Shared Aurora writer for analytics causing lock wait timeouts on production


Before I moved to Aurora, I had a standard master->slave configuration that isolated Looker, my analytics platform, on the slave. On the slave I had the tx_isolation DB parameter set to READ-COMMITTED to solve lock wait issues.

Now that I've moved everything over to Aurora MySQL 5.7 and everything is in one database cluster, I can't use the tx_isolation trick on the writer anymore, since that would cause production data inconsistency. As a result, analytics queries now cause "Lock wait timeout exceeded" errors.

This usually happens with queries that build large temporary tables from production data. They hold locks long enough to cause outages on our production website.


Solution

  • A workaround is to modify the MySQL connection parameters used by the analytics engine.

    You can pass the connection parameter sessionVariables=tx_isolation='READ-COMMITTED'. This gives analytics connections a lower isolation level, so their queries don't cause lock issues, while production queries keep their ACID guarantees.

    In Looker, the connection settings have an Additional Params field; you can copy and paste that string into it.
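
For analytics tools that take a full JDBC URL rather than a separate params field, the same setting can be appended to the URL's query string. The following is only a minimal sketch of how such a URL might be assembled; the helper name `analytics_jdbc_url`, the host `aurora-writer.example.com`, and the database name `analytics` are hypothetical placeholders, not values from the question.

```python
def analytics_jdbc_url(host: str, database: str, port: int = 3306) -> str:
    """Build a MySQL JDBC URL that requests READ-COMMITTED for this connection only.

    The sessionVariables string is the one from the answer above; it affects
    only sessions opened with this URL, not other clients of the cluster.
    """
    session_vars = "tx_isolation='READ-COMMITTED'"
    return f"jdbc:mysql://{host}:{port}/{database}?sessionVariables={session_vars}"


# Hypothetical host and database names, for illustration only.
url = analytics_jdbc_url("aurora-writer.example.com", "analytics")
print(url)
```

Because the variable is set per connection, production clients that connect without this parameter keep the cluster's default isolation level.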