I have generated an array of numbers and would like to remove the duplicates. In JavaScript I can just use [...new Set(arr)] and get the job done.
In Rust, I haven't found a simple way to achieve that so far.
I've written:
use rand::{thread_rng, Rng};
use itertools::Itertools;
fn main() {
    let mut arr: Vec<u8> = Vec::new();
    for _ in 0..10 {
        arr.push(thread_rng().gen_range(0..10))
    }
    println!("random {:?}", arr);
    arr.iter().unique();
    println!("unique {:?}", arr);
}
The output is:
random [7, 0, 3, 6, 7, 7, 1, 1, 8, 6]
unique [7, 0, 3, 6, 7, 7, 1, 1, 8, 6]
So I've tried to get the "no duplicate" result in another variable:
let res = &arr.iter().unique();
The result was:
Unique { iter: UniqueBy { iter: Iter([1, 2, 0, 0, 7, 0, 2, 2, 1, 6]), used: {} } }
Also, it seems I can't sort the array before removing the duplicates. The following code fails with the error "no method named `iter` found for unit type `()` in the current scope":
arr.sort().iter().unique();
Also, maybe there is a way to achieve the sort+unique value output without external crates?
Sorting an array is usually an acceptable way of deduplicating it, but a comparison sort takes O(n log n) time, whereas the hash-set approach you would use in JS takes O(n) on average. Unless you are using a radix sort (which is not the sorting method Rust uses), the set is therefore asymptotically better. Here is the Rust equivalent:
use std::collections::HashSet;

let a_vector = vec![7, 0, 3, 6, 7, 7, 1, 1, 8, 6];
let uniqued_vector = a_vector
    .into_iter()
    .collect::<HashSet<_>>()
    .into_iter()
    .collect::<Vec<_>>();
This turns your vector into an iterator, collects that iterator into a HashSet (which deduplicates it), then turns the set back into an iterator and finally collects it into a vector. Note that a HashSet does not preserve the original order of the elements.
See it on the playground.
If you wonder why we have to go back and forth between these iterator representations, it's because iterators are the "interface" Rust uses to transform one collection type into another, efficiently and while letting you perform operations along the way. Here we don't need to do anything beyond the conversion, which is why it may seem a little verbose.
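Because iterators are the conversion interface, the same pattern works with other standard collections. As a minimal sketch, assuming you want the result both deduplicated and sorted without external crates, you could collect into a std::collections::BTreeSet instead of a HashSet. The helper name sorted_unique is made up for this illustration:

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: returns the distinct values of `values` in ascending order.
fn sorted_unique(values: Vec<u8>) -> Vec<u8> {
    values
        .into_iter()
        // A BTreeSet keeps a single copy of each value, ordered by `Ord`.
        .collect::<BTreeSet<_>>()
        .into_iter()
        .collect()
}

fn main() {
    let a_vector = vec![7, 0, 3, 6, 7, 7, 1, 1, 8, 6];
    println!("{:?}", sorted_unique(a_vector)); // prints [0, 1, 3, 6, 7, 8]
}
```

Only the target type of the first collect changes; the rest of the chain is the same.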
itertools crate
The itertools crate provides utilities for working with iterators (the same ones we use as an interface to convert between datatypes). However, iterators are lazy: they are not a datatype that stores information, they only describe operations to be performed on a collection through the Iterator interface. For this reason, you need to turn an iterator back into a usable collection (or consume it in some other way), otherwise it does nothing (literally). That is why arr.iter().unique(); leaves arr unchanged, and why printing the result of .unique() shows the adapter itself rather than the deduplicated values.
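The laziness described above can be observed with the standard library alone. The following sketch counts how many times a closure passed to map runs before and after the iterator is consumed. The helper name lazy_demo and the use of a Cell as a counter are choices made for this illustration only:

```rust
use std::cell::Cell;

/// Hypothetical helper: reports how many times the `map` closure ran before
/// and after the iterator was consumed, together with the collected values.
fn lazy_demo(values: &[u8]) -> (usize, usize, Vec<u8>) {
    let calls = Cell::new(0);
    // Building the adapter does not run the closure.
    let doubled = values.iter().map(|x| {
        calls.set(calls.get() + 1);
        x * 2
    });
    let before = calls.get();
    // Consuming the iterator with `collect` is what drives the closure.
    let collected: Vec<u8> = doubled.collect();
    let after = calls.get();
    (before, after, collected)
}

fn main() {
    let (before, after, collected) = lazy_demo(&[7, 0, 3]);
    println!("calls before collect: {}", before); // prints calls before collect: 0
    println!("calls after collect: {}", after); // prints calls after collect: 3
    println!("{:?}", collected); // prints [14, 0, 6]
}
```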
So the correct version of your code would probably be:
use itertools::Itertools;

let a_vector = vec![7, 0, 3, 6, 7, 7, 1, 1, 8, 6];
let uniqued_vector = a_vector
    .into_iter()
    .unique()
    .collect::<Vec<_>>();
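You don't need to sort anything because, internally, .unique() works pretty much like the first implementation: it remembers the elements it has already seen in a hash-based collection. Unlike the HashSet round trip, it yields the elements in the order of their first occurrence.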
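To illustrate the idea without the crate, here is a rough standard-library sketch of an order-preserving deduplication that tracks already-seen values in a HashSet. It is not the actual itertools code, and the name unique_in_order is invented for this example:

```rust
use std::collections::HashSet;

/// Hypothetical helper: keeps the first occurrence of each value, in order.
fn unique_in_order(values: Vec<u8>) -> Vec<u8> {
    let mut seen = HashSet::new();
    values
        .into_iter()
        // `insert` returns false when the value was already in the set,
        // so repeated values are filtered out.
        .filter(|value| seen.insert(*value))
        .collect()
}

fn main() {
    let a_vector = vec![7, 0, 3, 6, 7, 7, 1, 1, 8, 6];
    println!("{:?}", unique_in_order(a_vector)); // prints [7, 0, 3, 6, 1, 8]
}
```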
As said earlier, sorting the array is fine, so you might still want to do that. However, unlike the previous solutions, this won't involve only iterators, because you can't sort an iterator: neither the Iterator trait nor the actual type produced by a_vector.into_iter() provides such a method. Once the array is sorted, you can deduplicate it, that is, remove consecutive repetitions, which the Iterator trait doesn't provide either. Both operations are provided directly by Vec, so the solution is simply:
let mut a_vector = vec![7, 0, 3, 6, 7, 7, 1, 1, 8, 6];
a_vector.sort_unstable();
a_vector.dedup();
After that, a_vector contains only unique elements, in sorted order.
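This also explains the error in the question: sort and sort_unstable sort the vector in place and return (), so nothing can be chained after them. A self-contained sketch using only the standard library could look like this, where the helper name sort_and_dedup is hypothetical:

```rust
/// Hypothetical helper: sorts the vector, then removes consecutive duplicates.
fn sort_and_dedup(mut values: Vec<u8>) -> Vec<u8> {
    // `sort_unstable` works in place and returns `()`,
    // so it is written as its own statement rather than chained.
    values.sort_unstable();
    // `dedup` only removes consecutive repetitions, hence the sort first.
    values.dedup();
    values
}

fn main() {
    let arr = vec![7, 0, 3, 6, 7, 7, 1, 1, 8, 6];
    println!("unique {:?}", sort_and_dedup(arr)); // prints unique [0, 1, 3, 6, 7, 8]
}
```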
Note that the claim that you can't sort or deduplicate an iterator only holds if you restrict yourself to the standard library. Itertools provides both a sorted method (along with sorted_unstable) and a dedup one, so with itertools you could do:
use itertools::Itertools;

let a_vector = vec![7, 0, 3, 6, 7, 7, 1, 1, 8, 6];
let uniqued_vector = a_vector
    .into_iter()
    .sorted_unstable()
    .dedup()
    .collect::<Vec<_>>();
But at this point you'd be better off using .unique().
If you wonder what the difference between .iter() and .into_iter() is, see this question.