The documentation doesn't say anything about that (http://www.ruby-doc.org/core-2.2.0/Array.html#method-i-uniq).
Also, is it using a naive O(n^2) search or something else like a hashmap ? In the latter case, should I understand that my elements must have a proper implementation of hash
and eql?
when I want to unicize them ?
Given the code (in C) for the Array#uniq
rb_ary_uniq(VALUE ary)
{
VALUE hash, uniq, v;
long i;
if (RARRAY_LEN(ary) <= 1)
return rb_ary_dup(ary);
if (rb_block_given_p()) {
hash = ary_make_hash_by(ary);
uniq = ary_new(rb_obj_class(ary), RHASH_SIZE(hash));
st_foreach(RHASH_TBL(hash), push_value, uniq);
}
else {
hash = ary_make_hash(ary);
uniq = ary_new(rb_obj_class(ary), RHASH_SIZE(hash));
for (i=0; i<RARRAY_LEN(ary); i++) {
st_data_t vv = (st_data_t)(v = rb_ary_elt(ary, i));
if (st_delete(RHASH_TBL(hash), &vv, 0)) {
rb_ary_push(uniq, v);
}
}
}
ary_recycle_hash(hash);
return uniq;
}
In the general case (the else
block), it creates a hash from the array (that unifies the key without keeping the order). Then it create a new empty array with the right size. Finally it go through the first Array and when it finds the key in the hash it delete that key and push it to the empty array.
Hence the order is kept.
I'd say the complexity is O(complexity(ary_make_hash) + N)
in time, which is probably O(N)
should I understand that my elements must have a proper implementation of
hash
andeql?
when I want to unicize them ?
Yes exactly, they should implement both hash
and eql?
methods, which almost all Object
s do.
We can see that if we were to call uniq
on an array of BasicObject
s, which do not define the hash
method, we would get an error:
irb(main):023:0> [BasicObject.new, BasicObject.new].uniq
Traceback (most recent call last):
5: from /usr/bin/irb:23:in `<main>'
#...
NoMethodError (undefined method `hash' for #<BasicObject:0x000000014311a1e0>)