Search code examples
ruby-on-railsarel

Trying to speed up finding an index in active record query


I have a page where I start with a list of check_ins or client visits and then I need to loop through each visit (i) to find all that client's other visits to determine the visits neighbor (i - 1). I need to pull information from that visits neighbor as well. This is taking a very long time and I thought there might be a speedier way to perform the query.

Assuming I start with the list of visits or check_ins:

check_ins.each_with_index do |ci, i|

  # access the client
  client = ci.client 

  # pull other relevent client visits
  client_cis = client.check_ins
    .visible
    .includes(:weigh_in)
    .select(:id, :week, :created_at, :type_of_weighin)
    .where.not(check_ins: {type_of_weighin: nil})
    .where.not(weigh_ins: {id: nil})
    .where("weigh_ins.date >= ?", ci.created_at - 3.weeks) 

  # access the visit's index
  i = client_cis.index(ci)

  # access the visit's neighbor
  neighbor = client_cis[ i - 1 ]

  # perform downstream calculations using visit and it's neighbor
  ...

end

Is there a faster way to perform this type of analysis?


Solution

  • You could try this although I make no performance improvement guarantees the following should result in the same functional mechanism

    grouped_check_ins = check_ins.group_by(&:client_id)
    weigh_ins_table = WeighIn.arel_table
    check_ins_table = CheckIn.arel_table 
    weigh_ins_join = Arel::Nodes::InnerJoin.new(
      weigh_ins_table,weigh_ins_table.create_on(
       weigh_ins_table[:check_in_id].eq(check_ins_table[:id]).and(
         weigh_ins_table[:date].gteq(
           check_ins_table[:created_at] - Arel.sql("interval '3 week'")
         ) 
        )
      )
    )
    Client
      .joins(:check_ins)
      .joins(weigh_ins_join)
      .eager_load(check_ins: :weigh_in)
      .where(id: grouped_check_ins.keys) # alternately .where(id: check_ins.select(:client_id))
      .where.not(check_ins: {type_of_weighin: nil})
      .each do |client|
        client_check_ins_with_neighbor = client.check_ins.each_cons(2)
           .select do |_,b| 
             grouped_check_ins[client.id].any? {|c| c.id == b.id }
           end 
        client_check_ins_with_neighbor.each do |neighbor,check_in| 
           # perform downstream calculations using visit and it's neighbor
        end  
      end 
    

    UPDATE to explain what this code does per the OP's request:

    • grouped_check_ins = check_ins.group_by(&:client_id) - group your CheckIn objects by the client_id for faster lookup. This results in a Hash of {client_id => [CheckIns]}

    • Arel work - This is so we can build a join with multiple conditions. In this case the conditions are "weigh_ins.check_in_id = check_ins.id AND weigh_ins.date >= check_ins.created_at - interval '3 week'"

    weigh_ins_table = WeighIn.arel_table
    check_ins_table = CheckIn.arel_table 
    weigh_ins_join = Arel::Nodes::InnerJoin.new(
      weigh_ins_table,weigh_ins_table.create_on(
       weigh_ins_table[:check_in_id].eq(check_ins_table[:id]).and(
         weigh_ins_table[:date].gteq(
           check_ins_table[:created_at] - Arel.sql("interval '3 week'")
         ) 
        )
      )
    )
    
    • Build the Query
    Client
      .joins(:check_ins)
      .joins(weigh_ins_join)
      .eager_load(check_ins: :weigh_in)
      .where(id: grouped_check_ins.keys) # alternately .where(id: check_ins.select(:client_id))
      .where.not(check_ins: {type_of_weighin: nil})
    

    This results in SQL akin to

    SELECT 
    /* every column from each table listed 
    with names generated by rails so that it can 
    perform eager loading appropriately */ 
    FROM 
      clients
      INNER JOIN check_ins ON check_ins.client_id = clients.id 
      INNER JOIN weigh_ins ON weigh_ins.check_in_id = check_ins.id 
        AND weigh_ins.date >= check_ins.created_at - interval '3 week'
    WHERE 
      clients.id IN (/* list of client_ids from the group_by */) 
    AND check_ins.type_of_weighin IS NOT NULL 
    
    • Iterate over the clients check ins in consecutive order, 2 at a time. Select each pair where the second item is in the list of client check ins originally provided.
    client_check_ins_with_neighbor = client.check_ins.each_cons(2)
      .select do |_,b| 
        grouped_check_ins[client.id].any? {|c| c.id == b.id }
    end
    
    • Then we just iterate over those selected pairs to perform whatever work needs to be done
    client_check_ins_with_neighbor.each do |neighbor,check_in| 
      # perform downstream calculations using visit and it's neighbor
    end