Tags: ruby-on-rails, concurrency, activemodel, database-locking

How do I ensure a model always uses a transaction and locks (in Rails)?


I noticed that Rails can have concurrency issues with multiple servers and would like to force my model to always lock. Is this possible in Rails, similar to the way unique constraints enforce data integrity? Or does it just require careful programming?

Terminal One

irb(main):033:0* Vote.transaction do
irb(main):034:1* v = Vote.lock.first
irb(main):035:1> v.vote += 1
irb(main):036:1> sleep 60
irb(main):037:1> v.save
irb(main):038:1> end

Terminal Two, while sleeping

irb(main):240:0* Vote.transaction do
irb(main):241:1* v = Vote.first
irb(main):242:1> v.vote += 1
irb(main):243:1> v.save
irb(main):244:1> end

DB Start

 select * from votes where id = 1;
 id | vote |         created_at         |         updated_at         
----+------+----------------------------+----------------------------
  1 |    0 | 2013-09-30 02:29:28.740377 | 2013-12-28 20:42:58.875973 

After execution

Terminal One

irb(main):040:0> v.vote
=> 1

Terminal Two

irb(main):245:0> v.vote
=> 1

DB End

select * from votes where id = 1;
 id | vote |         created_at         |         updated_at         
----+------+----------------------------+----------------------------
  1 |    1 | 2013-09-30 02:29:28.740377 | 2013-12-28 20:44:10.276601 

Other Example

http://rhnh.net/2010/06/30/acts-as-list-will-break-in-production


Solution

  • You are correct that transactions by themselves don't protect against many common concurrency scenarios; incrementing a counter is one of them. There isn't a general way to force a lock: you have to ensure you use it everywhere it is necessary in your code.

    For the simple counter incrementing scenario there are two mechanisms that will work well:

    Row Locking

    Row locking will work as long as you do it everywhere in your code where it matters, and it may take some experience to develop an instinct for where that is. If, as in your code above, you have two places where a resource needs concurrency protection and you only lock in one, you will have concurrency issues.
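The failure mode in the question's transcript can be reproduced in plain Ruby, with a Mutex standing in for the row lock (an illustrative sketch, no database involved): two read-modify-write sequences that interleave lose an update, while a lock covering the whole sequence serializes them.

```ruby
reads   = Queue.new   # signals "this writer has read"
barrier = Queue.new   # holds both writers until both have read

# Two unsynchronized read-modify-write sequences, forced to interleave:
unsafe_total = 0
writers = 2.times.map do
  Thread.new do
    v = unsafe_total        # both writers read 0...
    reads << true
    barrier.pop             # ...and wait until the other has read too
    unsafe_total = v + 1    # both write back 0 + 1: one increment is lost
  end
end
2.times { reads.pop }         # wait until both reads have happened
2.times { barrier << true }   # then release both writers
writers.each(&:join)
unsafe_total   # => 1, not 2: the lost update

# The same read-modify-write with a lock around the whole sequence:
lock = Mutex.new
safe_total = 0
2.times.map do
  Thread.new do
    lock.synchronize do
      v = safe_total
      safe_total = v + 1    # safe: read and write happen under one lock
    end
  end
end.each(&:join)
safe_total     # => 2
```

This is the same race as the two irb terminals: Terminal Two reads the row before Terminal One's write commits, so one increment disappears.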

    You want to use the with_lock form; it does a transaction and takes a row-level lock. (Table locks are obviously going to scale much more poorly than row locks, although for tables with few rows there is no difference, since PostgreSQL, though I'm not sure about MySQL, will use a table lock anyway.) It looks like this:

        v = Vote.first
        v.with_lock do
          v.vote +=1
          sleep 10
          v.save
        end
    

    The with_lock form creates a transaction, locks the row the object represents, and reloads the object's attributes, all in one step, minimizing the opportunity for bugs in your code. However, this does not necessarily help you with concurrency issues involving the interaction of multiple objects. It can work if (a) all possible interactions depend on one object, and you always lock that object, and (b) the other objects each only interact with one instance of that object, e.g. locking a user row and doing stuff with objects which all belong_to (possibly indirectly) that user object.
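The "always lock the one root object" discipline can be modeled in plain Ruby (a hedged sketch: this User class and its with_lock are stand-ins built on a Monitor, playing the role of the locked user row, not the ActiveRecord method): every operation on the user's dependent data first takes that user's single lock.

```ruby
require "monitor"

# Stand-in for the "root" user row: everything that touches this user's
# vote counts must go through user.with_lock, mirroring the discipline of
# always locking the same parent row first.
class User
  include MonitorMixin

  attr_reader :vote_counts

  def initialize
    super()                    # sets up the monitor
    @vote_counts = Hash.new(0)
  end

  def with_lock(&block)        # illustrative stand-in, not the AR method
    synchronize(&block)
  end
end

user = User.new
10.times.map do
  Thread.new do
    user.with_lock do
      # read-modify-write on dependent data, safe because every code
      # path locks the same root user object first
      current = user.vote_counts[:post]
      user.vote_counts[:post] = current + 1
    end
  end
end.each(&:join)
user.vote_counts[:post]   # => 10
```

The key property is that there is exactly one lock per user, so any two operations on the same user's data serialize, no matter which dependent object they touch.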

    Serializable Transactions

    The other possibility is to use serializable transactions. Since 9.1, PostgreSQL has had "real" serializable transactions. These can perform much better than locking rows (though it is unlikely to matter in the simple counter-incrementing use case).

    The best way to understand what serializable transactions give you is this: if you take all the possible orderings of all the (isolation: :serializable) transactions in your app, what happens when your app is running is guaranteed to always correspond with one of those orderings. With ordinary transactions this is not guaranteed to be true.

    However, what you have to do in exchange is to take care of what happens when a transaction fails because the database is unable to guarantee that it was serializable. In the case of the counter increment, all we need to do is retry:

        begin
          Vote.transaction(isolation: :serializable) do
            v = Vote.first
            v.vote += 1
            sleep 10 # this is to simulate concurrency 
            v.save
          end
        rescue ActiveRecord::StatementInvalid => e
          sleep rand/100 # this is NECESSARY in scalable real-world code, 
                         # although the amount of sleep is something you can tune.
          retry
        end
    

    Note the random sleep before the retry. This is necessary because failed serializable transactions have a non-trivial cost, so if we don't sleep, multiple processes contending for the same resource can swamp the db. In a heavily concurrent app you may need to gradually increase the sleep with each retry. The random is VERY important to avoid harmonic deadlocks -- if all the processes sleep the same amount of time they can get into a rhythm with each other, where they all are sleeping and the system is idle and then they all try for the lock at the same time and the system deadlocks causing all but one to sleep again.
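The growing, jittered sleep described above can be factored into a small helper (a sketch with illustrative names: with_retries is not a Rails API, and the error class is passed in so the same helper works for ActiveRecord::StatementInvalid or anything else): exponential backoff multiplied by a random factor breaks the lock-step rhythm.

```ruby
# Sketch of a jittered-backoff retry wrapper. `with_retries` is an
# illustrative helper, not part of Rails; pass the error class you expect
# (e.g. ActiveRecord::StatementInvalid for serialization failures).
def with_retries(error_class, max_attempts: 5, base_delay: 0.01)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue error_class
    raise if attempts >= max_attempts     # give up eventually
    # Exponential backoff scaled by rand: the base delay grows with each
    # retry, and the random factor keeps contending processes from
    # sleeping the same amount and retrying in unison.
    sleep(base_delay * (2**(attempts - 1)) * rand)
    retry
  end
end

# Usage shape (the transaction body is whatever needs to be serializable):
# with_retries(ActiveRecord::StatementInvalid) do
#   Vote.transaction(isolation: :serializable) do
#     v = Vote.first
#     v.vote += 1
#     v.save
#   end
# end
```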

    When the transaction that needs to be serializable involves interaction with a source of concurrency other than the database, you may still have to use row-level locks to accomplish what you need. An example of this would be when a state machine transition determines what state to transition to based on a query to something other than the db, like a third-party API. In this case you need to lock the row representing the object with the state machine while the third party API is queried. You cannot nest transactions inside serializable transactions, so you would have to use object.lock! instead of with_lock.

    Another thing to be aware of: any objects fetched outside the transaction(isolation: :serializable) block should have reload called on them before they are used inside the transaction.