ruby-on-rails ruby memory memory-management worker

How can I optimise this method in Ruby using preload, includes, or eager_load?


I want to reduce allocations and speed up a Ruby worker. I've been reading about eager loading, but I don't fully understand it yet. Here's the method:

def perform(study_id, timestamp)
  study = Study.includes(:questions, :participants).find(study_id)
  questions = study.questions.not_random.not_paused
  participants = study.participants
  return unless questions && participants

  end_timestamp = timestamp_window(timestamp)

  participants.each do |participant|
    process_participant(participant, questions, timestamp, end_timestamp, study)
  end
end

I was hoping that Study.includes() would reduce the number of database queries, but looking at Skylight, it doesn't seem to have changed anything:

Screenshot from Skylight showing 4 queries

Am I using includes incorrectly, or should I be using something else?


Solution

  • Your example doesn't benefit much from eager loading. Eager loading exists to avoid N+1 queries, such as this:

    User.first(100).each do |user|
      comments = user.comments
    end
    

    This will make 1 query for the 100 users, and 100 queries for the comments. That's why it's called N+1 (N being 100 here).

    To prevent this from happening, you'd use eager loading:

    User.includes(:comments).first(100).each do |user|
      comments = user.comments
    end
    

    Now it makes two queries: one for the users and one for the comments. Making two queries instead of one isn't a problem. What matters for optimization is how the cost grows as the data grows, which is what big O notation describes. I'm not going to explain all that, but this is a good tutorial: https://samurails.com/interview/big-o-notation-complexity-ruby/

    In the example without eager loading, the number of queries is O(N), which means 'linear': it grows in step with N. If you use eager loading, though, you can increase N without adding queries, so the query count is O(1), or constant. (The amount of data fetched still grows with N, but the number of round trips to the database does not.)
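
    To make the difference in query counts concrete, here is a minimal plain-Ruby sketch. It does not use Rails or a real database: FakeDB and its methods are hypothetical stand-ins that only count how many "queries" are issued, and the eager path imitates the general idea of loading all comments at once and matching them to users in memory.

```ruby
# Plain-Ruby simulation (no Rails, no database). FakeDB is a hypothetical
# stand-in that only counts how many "queries" get issued.
class FakeDB
  attr_reader :query_count

  def initialize(user_count, comments_per_user)
    @users = (1..user_count).map { |id| { id: id } }
    @comments = @users.flat_map do |user|
      (1..comments_per_user).map { |n| { user_id: user[:id], body: "comment #{n}" } }
    end
    @query_count = 0
  end

  # Stands in for: SELECT * FROM users LIMIT n
  def users(limit)
    @query_count += 1
    @users.first(limit)
  end

  # Stands in for: SELECT * FROM comments WHERE user_id = ?
  def comments_for(user_id)
    @query_count += 1
    @comments.select { |c| c[:user_id] == user_id }
  end

  # Stands in for: SELECT * FROM comments WHERE user_id IN (...)
  def comments_for_all(user_ids)
    @query_count += 1
    @comments.select { |c| user_ids.include?(c[:user_id]) }
  end
end

# Lazy loading: one query for the users, then one more per user (N+1).
def lazy_query_count(n)
  db = FakeDB.new(n, 2)
  db.users(n).each { |user| db.comments_for(user[:id]) }
  db.query_count
end

# Eager loading: one query for the users, one for all their comments,
# then the comments are matched to users in memory.
def eager_query_count(n)
  db = FakeDB.new(n, 2)
  users = db.users(n)
  by_user = db.comments_for_all(users.map { |u| u[:id] }).group_by { |c| c[:user_id] }
  users.each { |user| by_user.fetch(user[:id], []) }
  db.query_count
end

[10, 100].each do |n|
  puts "N=#{n}: lazy=#{lazy_query_count(n)} queries, eager=#{eager_query_count(n)} queries"
end
```

    In this simulation the lazy count is always N+1 (11 for N=10, 101 for N=100), while the eager count stays at 2 regardless of N.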

    In your case, you have a method that makes three queries:

    • Study (find one)
    • associated questions
    • associated participants

    An easy way to determine if you should use eager loading is to check your code for any SQL fetching that happens inside a loop. That's not happening in the method you've shown, so the eager loading won't do much. By contrast, it'd be good to use includes if you were fetching associated data for a list of studies.

    It might technically be possible to write a SQL query that gets all three tables' data in a single request, but I don't think ActiveRecord has anything that does it for you. It's probably unnecessary, though. If you're not convinced, you can try writing that SQL yourself and report on the performance gains.