I'm learning Rust and trying to solve an Advent of Code challenge (2015, day 9). I ended up with a variable of type Vec<&&str> (note the double '&'; it's not a typo). I'm now wondering whether this type is different from Vec<&str>. I can't figure out whether a reference to a reference to something would ever make sense. I know I can avoid this situation by using String for the from and to variables. I'm asking whether Vec<&&str> == Vec<&str> and whether I should try to avoid Vec<&&str>.
Here is the code that triggered this question:
use itertools::Itertools;
use std::collections::HashSet;
use std::fs;

fn main() {
    let contents = fs::read_to_string("input.txt").unwrap();
    let mut vertices: HashSet<&str> = HashSet::new();
    for line in contents.lines() {
        let data: Vec<&str> = line.split(" ").collect();
        let from = data[0];
        let to = data[2];
        vertices.insert(from);
        vertices.insert(to);
    }
    // `Vec<&&str>` originates from here
    let permutations_iter = vertices.iter().permutations(vertices.len());
    for perm in permutations_iter {
        let length_trip = compute_length_of_trip(&perm);
    }
}

fn compute_length_of_trip(trip: &Vec<&&str>) -> u32 {
    ...
}
I'm now wondering if this type is different than Vec<&str>.
Yes, Vec<&&str> is a different type from Vec<&str>: you can't pass a Vec<&&str> where a Vec<&str> is expected, or vice versa. Vec<&str> stores string slice references, which you can think of as pointers to data inside some strings. Vec<&&str> stores references to such string slice references, i.e. pointers to pointers to data. With the latter, accessing the string data requires an additional indirection.
However, Rust's auto-dereferencing makes it possible to use a Vec<&&str> much like you'd use a Vec<&str>. For example, v[0].len() will work just fine on either, v[some_idx].chars() will iterate over chars with either, and so on. The only difference is that Vec<&&str> stores the data more indirectly and therefore requires a bit more work on each access, which can lead to slightly less efficient code.
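As a minimal sketch of the auto-dereferencing described above, the following uses only the standard library. The helpers total_len and total_len_indirect and the city names are made up for illustration; they are not part of the question's code.

```rust
// Hypothetical helper: sums the byte lengths through one level of reference.
fn total_len(names: &[&str]) -> usize {
    names.iter().map(|s| s.len()).sum()
}

// Hypothetical helper: the same computation through two levels of reference.
// The body is identical because method calls auto-dereference.
fn total_len_indirect(names: &[&&str]) -> usize {
    names.iter().map(|s| s.len()).sum()
}

fn main() {
    let cities = ["London", "Dublin", "Belfast"];
    let direct: Vec<&str> = cities.to_vec();
    let indirect: Vec<&&str> = cities.iter().collect();

    // `len()` and `chars()` resolve to the `str` methods in both cases.
    assert_eq!(direct[0].len(), indirect[0].len());
    assert_eq!(direct[1].chars().count(), indirect[1].chars().count());
    assert_eq!(total_len(&direct), total_len_indirect(&indirect));
}
```

The two function signatures still differ, which is the point made earlier: the types are not interchangeable even though the method calls look the same.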
Note that you can always convert a Vec<&&str> to a Vec<&str>, but since doing so requires allocating a new vector, if you decide you don't want a Vec<&&str>, it's better not to create it in the first place.
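One way such a conversion could look, assuming you already hold a Vec<&&str>. The function name flatten is hypothetical, and the collect call is where the new allocation happens.

```rust
// Hypothetical helper: copies each inner `&str` out from behind its outer
// reference and collects the results into a freshly allocated vector.
fn flatten<'a>(nested: Vec<&&'a str>) -> Vec<&'a str> {
    nested.into_iter().copied().collect()
}

fn main() {
    let cities = ["London", "Dublin"];
    let nested: Vec<&&str> = cities.iter().collect();

    let flat: Vec<&str> = flatten(nested);
    assert_eq!(flat, vec!["London", "Dublin"]);
}
```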
Since &str is Copy, you can avoid creating a Vec<&&str> by adding .copied() when you iterate over vertices, i.e. change vertices.iter() to vertices.iter().copied(). If you don't need vertices sticking around, you can also use vertices.into_iter(), which will give out &str directly and will free the vertices set as soon as the iteration is done.
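Both options can be sketched with the standard library alone. The helper parse_vertices is a hypothetical stand-in for the parsing loop in the question, and the two input lines are sample data chosen for illustration.

```rust
use std::collections::HashSet;

// Hypothetical helper mirroring the question's loop: the first and third
// space-separated tokens of each line are treated as vertex names.
fn parse_vertices(contents: &str) -> HashSet<&str> {
    let mut vertices = HashSet::new();
    for line in contents.lines() {
        let data: Vec<&str> = line.split(' ').collect();
        vertices.insert(data[0]);
        vertices.insert(data[2]);
    }
    vertices
}

fn main() {
    let contents = "London to Dublin = 464\nLondon to Belfast = 518";
    let vertices = parse_vertices(contents);

    // `.iter()` alone would yield `&&str`; `.copied()` turns each item into `&str`.
    let mut borrowed: Vec<&str> = vertices.iter().copied().collect();

    // `.into_iter()` consumes the set and yields `&str` directly.
    let mut consumed: Vec<&str> = vertices.into_iter().collect();

    // A HashSet has no defined iteration order, so sort before comparing.
    borrowed.sort();
    consumed.sort();
    assert_eq!(borrowed, consumed);
}
```

Applied to the question's code, the same change would read vertices.iter().copied().permutations(vertices.len()), which should make each permutation a Vec<&str>.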
The reason why the additional reference arises and the ways to avoid it have been covered on StackOverflow before.
There is nothing inherently wrong with Vec<&&str> that would require one to avoid it. In most code you'll never notice the difference in efficiency between Vec<&&str> and Vec<&str>. Having said that, there are some reasons to avoid it beyond performance in microbenchmarks. The additional indirection in Vec<&&str> requires the exact &strs it was created from (and not just the strings that own the data) to stick around and outlive the new collection. This is not relevant in your case, but would become noticeable if you wanted to return the permutations to the caller that owns the strings. Also, there is value in the simpler type that doesn't accumulate a reference on each transformation. Just imagine needing to transform the Vec<&&str> further into a new vector: you wouldn't want to deal with Vec<&&&str>, and so on for every new transformation.
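The lifetime point can be illustrated with a small sketch. The function unique_first_words is hypothetical: it builds a local set and returns data to a caller that owns the underlying string, which only works because the extra reference is stripped with .copied().

```rust
use std::collections::HashSet;

// Hypothetical helper: the returned `&str`s borrow only from `contents`,
// so they stay valid after the local `set` is dropped.
fn unique_first_words<'a>(contents: &'a str) -> Vec<&'a str> {
    let set: HashSet<&'a str> = contents
        .lines()
        .filter_map(|line| line.split(' ').next())
        .collect();

    // Returning `set.iter().collect::<Vec<&&str>>()` instead would not compile:
    // those outer references point into `set`, which is dropped when this
    // function returns.
    let mut words: Vec<&'a str> = set.iter().copied().collect();
    words.sort();
    words
}

fn main() {
    let contents = String::from("a b\nc d\na e");
    let words = unique_first_words(&contents);
    assert_eq!(words, vec!["a", "c"]);
}
```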
Regarding performance, less indirection is usually better since it avoids an extra memory access and increases data locality. However, one should also note that a Vec<&str> takes up 16 bytes per element (on 64-bit architectures) because a slice reference is represented by a "fat pointer", i.e. a pointer/length pair. A Vec<&&str> (as well as Vec<&&&str> etc.), on the other hand, takes up only 8 bytes per element, because a reference to a fat reference is represented by a regular "thin" pointer. So if your vector holds millions of elements, a Vec<&&str> might be more efficient than a Vec<&str> simply because it occupies less memory. As always, if in doubt, measure.
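The element sizes can be checked with std::mem::size_of. This sketch compares against the size of usize rather than hard-coding 8 and 16, so it should also hold on targets that are not 64-bit.

```rust
use std::mem::size_of;

fn main() {
    // One machine word; 8 bytes on 64-bit targets.
    let word = size_of::<usize>();

    // `&str` is a fat pointer: a data pointer plus a length.
    assert_eq!(size_of::<&str>(), 2 * word);

    // A reference to a `&str` is an ordinary thin pointer,
    // no matter how many levels are stacked.
    assert_eq!(size_of::<&&str>(), word);
    assert_eq!(size_of::<&&&str>(), word);

    println!(
        "&str: {} bytes, &&str: {} bytes",
        size_of::<&str>(),
        size_of::<&&str>()
    );
}
```

Note that the smaller element size covers only the vector's own buffer; the &str values being pointed to still have to live somewhere else.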