Search code examples
mariadbsemaphoredatabase-replicationtable-lockingmaster-master

How to acquire exclusive lock in replicated mysql and clustered servers?


I have this scenario:
My php web application is deployed in a clustered environment, it uses a replicated mariadb database with galera. It has to acquire an exclusive lock on database row resource and proceed with a process that has to be done abolutely once and from just one server.

I must be sure that in case of concurrent multiple requests coming to the same, or different, servers just one request will pass.

When the application was warking in a single server environmente this was the logic:

  1. There was a column in resource table with the process status, let say processed with default 0. 0=not processed, 1=already processed.
  2. The isolation transaction level was set to READ COMMITTED.
  3. When the controller acquired the request, it entered in transaction with this query: SELECT processed FROM mytable WHERE id=? FOR UPDATE.
  4. If the query returned processed=0, It safetly proceded with the process, updated the processes status to 1 and commited the transaction. Otherwise, if processed value is >0 it would have skipped the process.

I have tested many times this implementation and it worked. The concurrent requests will be stopped waiting for the READ LOCK on the id index and just the first one proceeded to the next step. The isolation level to READ COMMITTED will ensure to get the last updated value of the semaphore column.

The problems started with the replicated database environment.

It looks like the READ LOCK in replicated MariaDB is optimistic, it starts the transation without waiting the lock acquisition and it will check for collision only at commit time.

So if I have 2 concurrent transactions, both are not stopped at FOR UPDATE query and both will get 0 value for the processed semaphore field. At commit one will win and other will fail. But it is too late. Of course this is not acceptable in my case because I have already let run my unique process two times.

How can I safely change my logic to suit the replicated environment?
How can be implemented a patter of exclusive resource lock with a replicated db?


Solution

  • Since Galera is not designed for write scalability nor synchronous replication, the ideal pattern for conflict-free usage of Galera is to always write to a single node but read from any node. The exact problem you are facing is why people tend to avoid writing to multiple Galera nodes at the same time: deadlocks on COMMIT seem to be rarely handled by the application in a graceful manner.

    A convenient solution to this problem that requires no changes in your application is to hide the complexity of the Galera cluster behind a smart SQL load balancer like MariaDB MaxScale. This will make your locking pattern work out-of-the-box with a Galera cluster. However, this does come at the cost of having to run the proxy on a separate node.

    In MaxScale, the Galera monitor module takes care of figuring out which node to write to in a consistent manner across multiple MaxScale instances. The readwritesplit router is what takes care of load balancing the reads and writes correctly. Combined together, they make MaxScale appear as if it was a single database but with the robustness of a whole Galera cluster.

    Here's an example configuration that takes care of sending writes to only one Galera node while still distributing reads to all nodes, all in a transactionally safe manner.

    [maxscale]
    threads=auto
    
    [node-1]
    type=server
    address=192.168.0.2
    
    [node-2]
    type=server
    address=192.168.0.3
    
    [node-3]
    type=server
    address=192.168.0.4
    
    [Galera-Cluster]
    type=monitor
    servers=node-1,node-2,node-3
    user=monitor_user
    password=monitor_pw
    
    [RW-Router]
    type=service
    router=readwritesplit
    cluster=Galera-Cluster
    user=service_user
    password=service_pw
    
    [RW-Port]
    type=listener
    service=RW-Router
    port=3306
    

    The MaxScale tutorial has more information about how to get started.