Search code examples
ruby-on-railsruby

Add two arrays of objects together while merging by a specific value in Ruby


I am working with Ruby On Rails and I have two SQL queries where each returns an array of objects. These two SQL queries are from two seperate DB's so they have different fields, however, I am supposed to merge the results into one list. Here is an example of the two SQL queries and what the end result should look like.

SQL1 result == [{id1: 1, name: john, role: user},{id1: 2, name: matt, role: admin}]
SQL2 result == [{id2: 4, externalName: john},{id2: 8, externalName: ronald}]

We want to combine any objects where name == externalName into one object but leave the rest as is.

RESULT == [{id1: 1, name: john, role: user, id2: 4},{id1: 2, name: matt, role: admin},{id2: 8, externalName: ronald}]

Basically, I want to add both lists together, however, when :externalName from the second list is equal to a :name value from the first list or vice versa, we want to combine these two seperate objects into one, as they would be considered "duplicates". The combined result of two objects would look like this {:id1, :name, :role, :id2}. The rest of the objects that dont satisfy this would still appear within the resulting array.

I've done a concat which just adds both arrays together into a nice array, however it does not merge on name == externalName. I have thought about using .merge() but not sure how I could do this on two entire array of objects and what would happen if the merge condition is not met, would those still appear in the resulting list. I've also thought about a map but then would need to figure out how to remove the resulting object after merging it into the other. Not sure what the best approach is right now, just looking for some tips.


Solution

  • You could perform this as follows:

    sql1 = [{id1: 1, name: 'john', role: 'user'},{id1: 2, name: 'matt', role: 'admin'}]
    sql2 = [{id2: 4, externalName: 'john'},{id2: 8, externalName: 'ronald'}]
    [*sql1,*sql2]
      .group_by { |h| h[:name] || h[:externalName] }
      .flat_map do |k,v| 
         v.reduce(&:merge).tap { |h| h.delete(:externalName) if h.key?(:name) }
      end
    #=> => [{:id1=>1, :name=>"john", :role=>"user", :id2=>4}, {:id1=>2, :name=>"matt", :role=>"admin"}, {:id1=>17, :role=>"user"}, {:id2=>8, :externalName=>"ronald"}]
    

    Steps:

    • [*sql1,*sql2] - Combine both Arrays

    • .group_by {|h| h[:name] || h[:externalName]} - Group the elements by the value of :name or :externalName if h[:name] is falsey (assumed nil in this instance)

    • map do |_,v| - will return a new Array based on the return value of the block, when called on Hash (as returned by group_by) it will yield key and value to the block. In this case we do not care about the key so we indicate this using _.

      • v.reduce(&:merge) - merge the values into a single Hash
      • tap { |h| h.delete(:externalName) if h.key?(:name) } - tap yields the resulting Hash to the block and we delete the :externalName key if the Hash has a name key. This block will always return the yielded object.

    Update Apparently name is not a required key, resulting in a large group of nil values, where merge reduction is not an option. Updated code to handle nil grouping.

    sql1 = [{id1: 1, name: 'john', role: 'user'},{id1: 2, name: 'matt', role: 'admin'}, {id1: 17, role: 'user'}, {id1: 12, role: 'admin'} ]
    sql2 = [{id2: 4, externalName: 'john'},{id2: 8, externalName: 'ronald'}, {id2: 42}]
    [*sql1,*sql2]
      .group_by { |h| h[:name] || h[:externalName] }
      .flat_map do |k,v| 
         next v unless k
         v.reduce(&:merge).tap { |h| h.delete(:externalName) if h.key?(:name) }
      end
    #=> [{:id1=>1, :name=>"john", :role=>"user", :id2=>4}, {:id1=>2, :name=>"matt", :role=>"admin"}, {:id1=>17, :role=>"user"}, {:id1=>12, :role=>"admin"}, {:id2=>42}, {:id2=>8, :externalName=>"ronald"}]
    

    Other Assumptions

    These are assumptions based on the post:

    • There are no duplicate names in sql1 or duplicate externalNames in sql2 (resulting in a maximum group of 2; 1 value from sql1 and 1 value from sql2)
    • There are no overlapping keys between sql1 and sql2. Otherwise the resulting value will be that of sql2