arrays, ruby, algorithm, enumerable

Subtracting two-dimensional arrays with a unique id column


I'd like to subtract one two-dimensional array from another, optionally comparing rows only by a "unique id" column.

I'm also curious whether there is a more descriptive way to say what I'm looking for.

But, for example, given two arrays:

big = [['foo','bar@','baz'],
       ['cat','moew@','purr'],
       ['dog','bark@','woof'],
       ['mew', 'two@', 'blue']]

little = [['foo','bar@','baz'],
          ['dog','moew@','woof'],
          ['dog','bark@','woof']]

Then we can subtract them:

big - little #=> [["cat", "moew@", "purr"], ["mew", "two@", "blue"]]

This keeps the 'cat' row because ['cat','moew@','purr'] != ['dog','moew@','woof']. However, I'd like those two rows to be considered equal because they have the same value in the "unique id" column.

This is how I solved it:

big = [['foo','bar@','baz'],
       ['cat','moew@','purr'],
       ['dog','bark@','woof'],
       ['mew', 'two@', 'blue']]

little = [['foo','bar@','baz'],
          ['dog','moew@','woof'],
          ['dog','bark@','woof']]


def subtract big, little, key_index=nil
  return big - little unless key_index
  little_keys = little.map { |row| row[key_index] }.flatten
  big.inject([]) do |result, row|
    result << row unless little_keys.grep(row[key_index]).any?
    result
  end
end

subtract(big,little) #=> [["cat", "moew@", "purr"], ["mew", "two@", "blue"]]
subtract(big, little, 1) #=> [["mew", "two@", "blue"]]

I'm curious to know how best to describe what I'm trying to do, and whether there is a better way to do it.

Also, is my way O(n^2) because it's going through the entire array twice? Once for the #inject and once for #grep?


Solution

  • I'm on a bus and cannot type code easily. If you have unique ids, you could convert each 2D array to a hash with the unique id as the key and the row as the value.

    big = { 'bar@' => ['foo','bar@','baz'], ... 
    

    With little converted to a hash the same way, the data you want should be something like big.values_at(*(big.keys-little.keys))
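
The hash idea above can be sketched roughly as follows. This is only an illustration, not the answerer's own code: the helper names index_by_column and subtract_by_key are made up for the example, and it assumes the values in the key column are unique within each array (if they are not, a later row silently replaces an earlier one with the same key).

```ruby
# Hypothetical helper: build a hash mapping each row's key-column value to the row.
# Assumes keys are unique; duplicate keys would overwrite earlier rows.
def index_by_column(rows, key_index)
  rows.map { |row| [row[key_index], row] }.to_h
end

# Hypothetical helper: keep the rows of `big` whose key does not appear in `little`.
def subtract_by_key(big, little, key_index)
  big_by_key    = index_by_column(big, key_index)
  little_by_key = index_by_column(little, key_index)
  remaining_keys = big_by_key.keys - little_by_key.keys
  big_by_key.values_at(*remaining_keys)
end

big = [['foo', 'bar@', 'baz'],
       ['cat', 'moew@', 'purr'],
       ['dog', 'bark@', 'woof'],
       ['mew', 'two@', 'blue']]

little = [['foo', 'bar@', 'baz'],
          ['dog', 'moew@', 'woof'],
          ['dog', 'bark@', 'woof']]

result = subtract_by_key(big, little, 1)
p result #=> [["mew", "two@", "blue"]]
```

On the complexity question: in the original subtract, little_keys.grep scans all of little's keys once for every row of big, so the work grows with big.size * little.size. That is why it is quadratic, rather than because the array is traversed twice. The hash version replaces that inner scan with key lookups, so it should scale roughly linearly with the total number of rows.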