Search code examples
ruby-on-railsoptimizationrefactoringrails-activerecord

Rails: Code optimization/restructuring requested


I have the following code snippet that works perfectly and as intended:

    # Prepares the object design categories and connects them via bit mapping with the objects.design_category_flag
    def prepare_bit_flag_positions
      # Updates the bit_flag_position and the corresponding data in the object table with one transaction
      ActiveRecord::Base.transaction do
        # Sets the bit flag for object design category
        ObjectDesignCategory.where('0 = (@rownum:=0)').update_all('bit_flag_position = 1 << (@rownum := 1 + @rownum)')

        # Resets the object design category flag
        Object.update_all(design_category_flag: 0)
        # Sets the new object design category bit flag
        object_group_relation = Object.joins(:object_design_categories).select('BIT_OR(bit_flag_position) AS flag, objects.id AS object_id').group(:id)
        join_str = "JOIN (#{object_group_relation.to_sql}) sub ON sub.object_id = objects.id"
        Object.joins(join_str).update_all('design_category_flag = sub.flag')
      end

But in my opinion it is quite difficult to read. So I tried to rewrite this code without raw SQL. What I created was this:

  def prepare_bit_flag_positions
      # Updates the bit_flag_position and the corresponding data in the object table with via transaction
      ActiveRecord::Base.transaction do
        # Sets the bit flag for the object color group
        ObjectColorGroup.find_each.with_index do |group, index|
          group.update(bit_flag_position: 1 << index)
        end

        # Resets the object color group flag
        Object.update_all(color_group_flag: 0)
        # Sets the new object color group bit flag
        Object.find_each do |object|
          object.update(color_group_flag: object.object_color_groups.sum(:bit_flag_position))
        end
      end
    end

This also works fine, but when I run a benchmark for about 2000+ records, the second option is about a factor of 65 slower than the first. So my question is:

Does anyone have an idea how to redesign this code so that it doesn't require raw SQL and is still fast?


Solution

  • I can see 2 sources of slowing:

    1. N+1 problem
    2. Instantiating objects
    3. Calls to DB

    This code has the N+1 Problem. I think this may be the major cause of the slowing.

    Object.find_each do |object|
      object.update(color_group_flag: object.object_color_groups.sum(:bit_flag_position))
    end
    

    Change to

    Object.includes(:object_color_groups).find_each do |object|
      ...
    end
    

    You can also use Object#update class method on this code (see below).

    I don't think you can get around #2 without using raw SQL. But, you will need many objects (10K or 100K or more) to see a big difference.

    To limit the calls to the DB, you can use Object#update class method to update many at once.

    ObjectColorGroup.find_each.with_index do |group, index|
      group.update(bit_flag_position: 1 << index)
    end
    

    to

    color_groups = ObjectColorGroup.with_index.map do |group, index|
      [group.id, { bit_flag_position: group.bit_flag_position: 1 << index }]
    end.to_h
    ObjectColorGroup.update(color_groups.keys, color_groups.values)
    

    The following is a single query, so no need to change.

    Object.update_all(color_group_flag: 0)
    

    Reference:

    1. ActiveRecord#update class method API
    2. ActiveRecord#update class method blog post
    3. Rails Eager Loading