Search code examples
ruby-on-railsjoinrails-activerecordcoalesce

Access associated records when using custom join on coalesce-d id


I have two models in my Rails app, Organization and Warehouse. Entries in Warehouse belong to entries in Organization either via Warehouse.organization_id ('owned' warehouses) or Warehouse.delegated_to_organization_id ('delegated' warehouses).

A use case requires me to go over all warehouses and their 'responsible' organization: The 'delegating' organization if existing, otherwise the 'owning' organization. However, the use cases requires me to have all warehouses grouped by organization and is executed in batches (1), leading me to this approach:

def organizations
  Organization.joins(warehouse_join)
end

def warehouse_join
  <<~SQL
    INNER JOIN warehouses
    ON coalesce(
      warehouses.delegated_to_organization_id,
      warehouses.organization_id,
      -1
    ) = organizations.id
  SQL
end

However, ActiveRecord seemingly ignores the join: organizations.first.warehouses always yields [], while executing the query with ActiveRecord::Base.connection.execute(organizations.to_sql.gsub('"organizations".*', '*')).to_a shows joined fields from the warehouses table.

What am I doing wrong and how can I make this work?


(1) The data volume is too large to read everything into memory, so the actual query is executed in batches (having a limit attached).


Solution

  • organizations.first.warehouses will only fetch warehouses by organization_id. Your warehouse_join does not affect this relation.

    Also, in order to have associated objects available through an individual object of the queried collection without extra DB queries you have to use preloading (eager loading) methods such as includes.

    For example, if you want to fetch a list of organizations with their warehouses, you should do that like this:

    orgs = organizations.includes(:warehouses)
    
    # now you can access `warehouses` of each individual `organization` without extra DB queries
    orgs.first.warehouses
    orgs.second.warehouses
    

    As far as I know it is impossible (or not so easy) in ActiveRecord to use includes with a custom JOIN condition.

    But in your case I wouldn't try to do that, there is another way. Use two separate associations (one for each foreign key):

    class Organization < ActiveRecord::Model
      # delegated warehouses         
      has_many :delegated_warehouses, class_name: "Warehouse", 
                                      foreign_key: "delegated_to_organization_id" 
      # owned, but not delegated to any other organization
      has_many :owned_not_delegated_warehouses, -> { where(delegated_to_organization_id: nil) }, 
                                                class_name: "Warehouse", 
                                                foreign_key: organization_id                                  
    
      def warehouses_under_responsibility
        # do not do like this!!! those associations are for preloading only!
        delegated_warehouses + owned_not_delegated_warehouses # wrong!!!
    
        # the right way
        Warehouse.where("coalesce(delegated_to_organization_id, organization_id, -1) = ?", id)
      end
    end
    

    Now you can use the associations to separately preload (eager load) warehouses that are under responsibility.

    orgs = Organization.includes(:delegated_warehouses, :owned_not_delegated_warehouses)
    
    orgs.each do |org|
      # now you can access the associated records separately
      owned_warehouses = org.owned_not_delegated_warehouses
      delegated_warehouses = org.delegated_warehouses
    
      # or use them all together
      all_warehouses = owned_warehouses + delegated_warehouses
    end
    

    Note:

    Those associations should only be used when preloading warehouses, i.e. when there is a large collection of organization and you need to fetch the warehouses they are responsible for.

    If you have a single organization and need to fetch the warehouses it is responsible for, you should use another approach, which is shown in the warehouses_under_responsibility method.

    It is important to understand why we didn't use the same approach when preloading. That is because we can't make include work with the custom join condition which is coalesce(delegated_to_organization_id, organization_id, -1) = organizations.id