ruby-on-rails ruby multithreading race-condition

Understanding race_condition_ttl in Rails


I am trying to understand the race_condition_ttl directive in Rails when using Rails.cache.fetch.

I have a controller action that looks like this:

def foo
  @foo = Rails.cache.fetch("foo-testing", expires_in: 30.seconds, race_condition_ttl: 60.seconds) do
    Time.now.to_s
  end
  @foo # this gets used in a view down the line...
end

Based on what I'm reading in the Rails docs, this value should expire after 30 seconds, but the stale value is allowed to be served for another 60 seconds. However, I can't figure out how to reproduce conditions that will show me this behavior working. Here is how I'm trying to test it.

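require 'rest-client'

# Fire 100 concurrent requests, then collect the distinct responses.
threads = 100.times.map do
  Thread.new { RestClient.get("http://myenvironment/foo") }
end
threads.map { |t| t.join.value }.uniq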

I have my Rails app running on a VM behind a standard nginx/unicorn setup. I am trying to spawn 100 threads hitting the site simultaneously to simulate the "dog pile effect". However, when I run my test code, all the threads report the same value back. What I would expect to see is that one thread gets the fresh value, while at least one other thread gets served some stale content.

Any pointers are welcome! Thanks so much.


Solution

  • You are setting race_condition_ttl to 60 seconds, which means your threads may only start getting the new value after this time expires, and that is on top of the initial 30 seconds.

    Your test doesn't look like it would take 1.5 minutes to run, which would be required for the threads to start getting the new value. From the Rails cache docs:

    Yes, this process is extending the time for a stale value by another few seconds. Because of extended life of the previous cache, other processes will continue to use slightly stale data for a just a bit longer.

    The text implies using a small race_condition_ttl, which makes sense both for its purpose and for your test.

    UPDATE

    Also note that the life of stale cache is extended only if it expired recently. Otherwise a new value is generated and :race_condition_ttl does not play any role.

    Without reading the source, it is not particularly clear how Rails decides when the server is getting hammered, or what exactly "recently" means in the quote above. It does seem clear that the first process (of many) waiting to access the cache gets to set the new value while extending the life of the previous one. The presence of waiting processes might be the condition Rails looks for. In any case, the expected behaviour should be observable once both the initial timeout and the TTL have expired and the cache starts serving the updated value. The delay between the initial timeout and the moment the new value starts showing up should be similar to the TTL. Of course, the precondition is that the server is being hammered around the moment the initial timeout expires.
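
To make the "expired recently" rule from the quoted docs more concrete, here is a minimal sketch of one possible reading of it. This is a toy model, not the Rails implementation: ToyCache, its injected clock, and the variable names are all hypothetical and exist only for illustration. In this model the stale value is only visible to other callers while the first caller's block is still running, so a block as fast as Time.now.to_s would leave almost no window to observe.

```ruby
# Toy model of an expiring cache with a race_condition_ttl-style grace window.
# NOT the Rails implementation; ToyCache and its clock are hypothetical.
class ToyCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(clock)
    @clock = clock # callable returning the current time in seconds
    @store = {}
  end

  def fetch(key, expires_in:, race_condition_ttl: 0)
    now = @clock.call
    entry = @store[key]

    if entry && now >= entry.expires_at
      if race_condition_ttl > 0 && now - entry.expires_at <= race_condition_ttl
        # Expired recently: extend the stale entry so that other callers
        # keep reading it while this caller regenerates the value.
        entry.expires_at = now + race_condition_ttl
      else
        # Expired long ago: drop it, the grace window plays no role.
        @store.delete(key)
      end
      entry = nil # this caller regenerates in both cases
    end

    return entry.value if entry

    value = yield
    @store[key] = Entry.new(value, @clock.call + expires_in)
    value
  end
end

now = 0.0
cache = ToyCache.new(-> { now })
opts = { expires_in: 30, race_condition_ttl: 5 }

cache.fetch("foo", **opts) { "v1" } # t=0: cache is empty, stores "v1"

now = 31.0 # just past the 30-second expiry
seen_by_other = nil
seen_by_first = cache.fetch("foo", **opts) do
  # While the first caller is still "computing", a second caller arrives.
  seen_by_other = cache.fetch("foo", **opts) { "should not run" }
  "v2"
end
after_refresh = cache.fetch("foo", **opts) { "v3" }

seen_by_other  # => "v1" (stale value served during regeneration)
seen_by_first  # => "v2"
after_refresh  # => "v2" (fresh value is served once the block has finished)

now = 200.0 # long past expiry, well outside the 5-second grace window
inner = nil
cache.fetch("foo", **opts) do
  inner = cache.fetch("foo", **opts) { "v-inner" }
  "v-outer"
end
inner # => "v-inner" (no stale entry to fall back on, so both callers regenerate)
```

If the real cache behaves along these lines, reproducing the effect would need a deliberately slow block (for example a sleep inside it) and requests that arrive just after the 30-second expiry, rather than a single burst against a freshly written value.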