rust memory-safety

Why does the Rust documentation say that sharing a reference to a vector would create an invalid vector even though the vector is on the heap?


The following is an excerpt from The Rust Programming Language chapter on ownership:

Now consider the following code fragment:

let v = vec![1, 2, 3];

let mut v2 = v;

The first line allocates memory for the vector object v on the stack like it does for x above. But in addition to that it also allocates some memory on the heap for the actual data ([1, 2, 3]). Rust copies the address of this heap allocation to an internal pointer, which is part of the vector object placed on the stack (let's call it the data pointer).

It is worth pointing out (even at the risk of stating the obvious) that the vector object and its data live in separate memory regions instead of being a single contiguous memory allocation (due to reasons we will not go into at this point of time). These two parts of the vector (the one on the stack and one on the heap) must agree with each other at all times with regards to things like the length, capacity, etc.

When we move v to v2, Rust actually does a bitwise copy of the vector object v into the stack allocation represented by v2. This shallow copy does not create a copy of the heap allocation containing the actual data. Which means that there would be two pointers to the contents of the vector both pointing to the same memory allocation on the heap. It would violate Rust’s safety guarantees by introducing a data race if one could access both v and v2 at the same time.

For example if we truncated the vector to just two elements through v2:

v2.truncate(2);

and v were still accessible we'd end up with an invalid vector since v would not know that the heap data has been truncated. Now, the part of the vector v on the stack does not agree with the corresponding part on the heap. v still thinks there are three elements in the vector and will happily let us access the non existent element v[2] but as you might already know this is a recipe for disaster. Especially because it might lead to a segmentation fault or worse allow an unauthorized user to read from memory to which they don't have access.

As I understand it, truncating the vector through v2 updates the data in heap memory. v still points to that same heap memory, so after the truncation it should see the new values. So why does the book say:

and v were still accessible we'd end up with an invalid vector since v would not know that the heap data has been truncated


Solution

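  • What you're missing is that a Vec is more than a pointer to the heap. The vector object itself (the part on the stack here) also stores len, the number of elements in the heap buffer, along with the capacity; the heap holds only the elements. Calling v2.truncate(2) updates the len stored in v2 and drops the removed element, but v's bitwise copy would still say the length is 3. If v were still accessible, it would treat the truncated slot as a live element of the vector, even though that element has already been dropped (and, for element types that own resources, such as String, those resources have been freed).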
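
To make the point above concrete, here is a minimal sketch in safe Rust. The Handle struct and the snapshot function are hypothetical helpers introduced only for this illustration; they are not part of the standard library. The "before" snapshot plays the role of the stale v: it records the pointer, length, and capacity as they were when the vector was moved.

```rust
/// Hypothetical helper: a copy of the three values a Vec handle carries.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Handle {
    ptr: *const i32, // address of the heap buffer
    len: usize,      // number of live elements
    cap: usize,      // number of elements the buffer can hold
}

/// Hypothetical helper: record the handle's current values.
fn snapshot(v: &Vec<i32>) -> Handle {
    Handle {
        ptr: v.as_ptr(),
        len: v.len(),
        cap: v.capacity(),
    }
}

fn main() {
    let v = vec![1, 2, 3];
    let mut v2 = v; // v is moved; the compiler rejects any later use of v

    // What a still-accessible `v` would have kept believing.
    let before = snapshot(&v2);

    v2.truncate(2);
    let after = snapshot(&v2);

    // Only the length stored in the handle changed; the heap buffer
    // was neither moved nor shrunk.
    assert_eq!(before.len, 3);
    assert_eq!(after.len, 2);
    assert_eq!(after.ptr, before.ptr);
    assert_eq!(after.cap, before.cap);

    println!("before: {:?}", before);
    println!("after:  {:?}", after);
}
```

The stale snapshot still claims three elements while the real vector has two, which is exactly the disagreement the book describes. Because the move makes v unusable, that disagreement can never be observed in safe Rust.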