While learning Rust, I ran across this snippet embedded in an answer to an online problem. I didn't expect it to be the correct answer, as I naively thought elem_refs.sort() would sort by the references in the vector rather than by the strings.
Why does elem_refs.sort() sort on the strings rather than on the references to the strings (or rather, the &str's)? Obviously this is convenient and the desired result, but where is this behavior documented? Much obliged for your insights.
fn main() {
    let list = ["b", "d", "q", "a", "t"];
    let mut elem_refs: Vec<&&str> = list.iter().collect();
    println!("{:?}", elem_refs);
    elem_refs.sort();
    println!("{:?}", elem_refs);
}
["b", "d", "q", "a", "t"]
["a", "b", "d", "q", "t"]
To my understanding, there are two answers to this question: the "philosophical" and the practical.
The "philosophical" answer is that Rust prefers to deal with data, not pointers.
Even if you have a reference to a value, it's often treated as just a handle to the value, and the fact that behind the scenes a reference is just a pointer with a specific address in memory is a detail that Rust prefers you not think about too much. (At least, this is my understanding of Rust philosophy. Feel free to correct me.)
Even if you have two references, comparing them using == compares them by value (due to a call to PartialEq::eq). If you really want to compare them by reference, you need to either convert them to raw pointers first or call a function like std::ptr::eq or std::ptr::addr_eq. (And to my understanding, even the idea of comparison by reference is considered unidiomatic in Rust.)
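To make the difference concrete, here is a minimal sketch. The helper names equal_by_value and same_address are my own, made up for illustration, and I use std::ptr::eq for the address comparison:

```rust
// Hypothetical helper: `==` on references forwards to PartialEq on the
// referenced values, so this compares the string contents.
fn equal_by_value(a: &String, b: &String) -> bool {
    a == b
}

// Hypothetical helper: std::ptr::eq compares the addresses the
// references point to, not the contents.
fn same_address(a: &String, b: &String) -> bool {
    std::ptr::eq(a, b)
}

fn main() {
    // Two separate Strings with identical contents.
    let a = String::from("hello");
    let b = String::from("hello");

    println!("by value:   {}", equal_by_value(&a, &b)); // true
    println!("by address: {}", same_address(&a, &b)); // false
    println!("same var:   {}", same_address(&a, &a)); // true
}
```

I used String rather than string literals here because, as far as I know, the compiler is allowed to store identical literals at the same address, which would muddy the address comparison.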
Now, the practical answer is that the implementation of sort uses the Ord trait to decide the order of the elements in the sorted slice (look at the where clause in the signature of sort).
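The Ord trait (and the related PartialOrd trait) exists for the purpose of ordering values, and the standard library implements Ord for a reference &A whenever A itself implements Ord, by forwarding the comparison to the referenced value. So comparing two &&str elements forwards to &str, then to str, and the implementation of Ord for str calls self.as_bytes().cmp(other.as_bytes()) to compare the actual bytes contained in the strings. As for where this is documented: that implementation for references is listed among the trait implementations in the standard library documentation for the reference primitive type, and among the implementors of Ord.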
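Here is a small sketch contrasting the default ordering with an ordering by address. The helpers cmp_by_value and cmp_by_address are hypothetical names of mine, and the address-based sort only exists to show what the question expected to happen:

```rust
use std::cmp::Ordering;

// Hypothetical helper: Ord for a reference forwards to the referenced
// value, so this ends up comparing the string bytes.
fn cmp_by_value(a: &&str, b: &&str) -> Ordering {
    Ord::cmp(a, b)
}

// Hypothetical helper: compares the addresses of the &str slots that the
// outer references point to, ignoring the string contents.
fn cmp_by_address(a: &&str, b: &&str) -> Ordering {
    let pa = a as *const &str;
    let pb = b as *const &str;
    Ord::cmp(&pa, &pb)
}

fn main() {
    let list = ["b", "d", "q", "a", "t"];

    // Default sort: uses Ord, which compares the strings themselves.
    let mut by_value: Vec<&&str> = list.iter().collect();
    by_value.sort();
    println!("{:?}", by_value); // ["a", "b", "d", "q", "t"]

    // Sorting by address: array elements sit at ascending addresses,
    // so this restores the original array order.
    let mut by_address: Vec<&&str> = list.iter().collect();
    by_address.reverse();
    by_address.sort_by(|a, b| cmp_by_address(a, b));
    println!("{:?}", by_address); // ["b", "d", "q", "a", "t"]

    println!("{:?}", cmp_by_value(&list[0], &list[3])); // Greater
}
```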
Somewhat related to this is the behaviour of the dot operator (.) in Rust, which is used to access the fields and methods of a type. When you use the dot operator on a reference, the compiler implicitly dereferences it as many times as needed to find the field or method.
It will even dereference it multiple times if it's a nested reference. This means that for any type T (even primitive types like i32, for method calls), if you have a variable let my_var: &&&T = ..., and you do something like my_var.some_field, that is equivalent to (***my_var).some_field.
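A short sketch of that auto-dereferencing, using a made-up Point struct and helper functions of my own (none of these names come from the question):

```rust
// Hypothetical type used only for illustration.
struct Point {
    x: i32,
}

// Field access through three levels of reference: the compiler inserts
// the dereferences, so `r.x` means `(***r).x`.
fn x_through_refs(r: &&&Point) -> i32 {
    r.x
}

// The same applies to method calls on primitives: `abs` is defined on
// i32, and the nested reference is dereferenced to reach it.
fn abs_through_refs(n: &&&i32) -> i32 {
    n.abs()
}

fn main() {
    let p = Point { x: 21 };
    let r: &&&Point = &&&p;
    println!("{}", x_through_refs(r)); // 21
    println!("{}", (***r).x); // 21, written out explicitly

    let v: i32 = -5;
    println!("{}", abs_through_refs(&&&v)); // 5
}
```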
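This in turn means that method calls written on a reference usually end up operating on the actual data inside the reference, not on the reference itself. Note, though, that this is a convenience of the dot syntax and is not what makes sort compare the strings: sort is generic over the element type, and the by-value ordering comes from the Ord implementation for references, which forwards the comparison to the referenced value.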
A disclaimer that I'd like to add is that I'm somewhat new to Rust, and so not everything I said might be 100% correct. I'm open to comments on this answer correcting me, so that I can edit this answer with the corrections.