I was learning Rust with a book and the following excerpt threw me off a bit:
Also note that
&str has the
& in front of it because you need a reference to use a
str. That's because of the reason we saw above: the stack needs to know the size, and a
str can be of any length. So we access it with a
&, a reference. The compiler knows the size of a reference's pointer, and it can then use the
& to find where the
str data is and read it. Also, because you use a
& to interact with a
str, you don't own it. But a
String is an "owned" type.
I understand that for variables of unknown size, you must place the data on the heap and then refer to it with a fixed-length pointer on the stack. My confusion lies with the statement that
str can be of any length.
Why can't a
String also be of unknown length at times and require the same reference-to-data-on-the-heap approach?
I understand that the book will probably dive deeper into the details later on, but I was wondering if someone could already provide some additional context for me, specifically regarding the question above? Any useful accompanying details regarding the
String types in Rust that are good to know for a beginner to the language would be highly appreciated as well.
Like a slice,
str is a variably-sized type. (In fact,
str is essentially a
[u8] guaranteed to contain valid UTF-8.)
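To make the "bytes plus a UTF-8 guarantee" point concrete, here is a minimal sketch using only the standard library. The helper names bytes_of and try_as_str are made up for illustration; as_bytes and std::str::from_utf8 are the real standard-library calls doing the work.

```rust
// Hypothetical helper: the raw UTF-8 bytes behind a string slice.
fn bytes_of(s: &str) -> &[u8] {
    s.as_bytes()
}

// Hypothetical helper: view raw bytes as a str, or None if they are not valid UTF-8.
fn try_as_str(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

fn main() {
    // 'h' takes one byte in UTF-8, 'é' takes two, so the str is three bytes long.
    println!("{:?}", bytes_of("hé"));

    // The same three bytes are accepted as a str.
    println!("{:?}", try_as_str(&[0x68u8, 0xC3, 0xA9]));

    // 0xFF never appears in valid UTF-8, so these bytes are rejected.
    println!("{:?}", try_as_str(&[0xFFu8]));
}
```

Going from str to bytes always works, while going from bytes to str needs a validity check; that check is where the guarantee comes from.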
Variably-sized types are special. They do not implement the
Sized trait. A reference to a variably-sized type is "fat": it doesn't just hold the address of the referenced thing, but also its size.
str therefore means "some area in memory which contains valid UTF-8 data". And
&str is "the address and size of such an area".
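The "fat" reference can be observed by comparing reference sizes. This is only a sketch: ref_sizes is a made-up helper, and the two-word size of a slice reference is how current compilers lay it out rather than something the code depends on.

```rust
use std::mem::size_of;

// Hypothetical helper: sizes in bytes of a thin reference and two fat ones.
fn ref_sizes() -> (usize, usize, usize) {
    (size_of::<&u8>(), size_of::<&str>(), size_of::<&[u8]>())
}

fn main() {
    let (thin, fat_str, fat_slice) = ref_sizes();
    println!("&u8:   {} bytes (address only)", thin);
    println!("&str:  {} bytes (address + length)", fat_str);
    println!("&[u8]: {} bytes (address + length)", fat_slice);

    // Both halves of the fat reference are accessible.
    let s: &str = "hello";
    println!("address = {:p}, length = {}", s.as_ptr(), s.len());
}
```

A reference to a sized type such as u8 is a single machine word, while a reference to str or [u8] carries the length as a second word.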
String on the other hand is a struct with a fixed size. One of its members is a pointer to string data somewhere else (on the heap). Conceptually, a
String contains a
&str along with the unused capacity of the memory area. (In reality, a
String is a wrapper around a
Vec<u8> with a UTF-8 guarantee, and a
Vec<u8> conceptually contains a
&[u8] plus capacity but is really a raw pointer, a length and a capacity.)
The total memory required by a
String is therefore still variable, but the part that is the
String struct itself is known.
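A short sketch of that distinction, assuming nothing beyond the standard library (struct_size is a made-up helper name): the struct stays the same size no matter how much string data it manages.

```rust
use std::mem::size_of_val;

// Hypothetical helper: size in bytes of the String struct itself
// (pointer, length, capacity), not counting the heap buffer it manages.
fn struct_size(s: &String) -> usize {
    size_of_val(s)
}

fn main() {
    let short = String::from("hi");
    let long = "x".repeat(10_000);

    // The struct size is identical; only the managed buffer differs.
    println!("struct:   {} vs {} bytes", struct_size(&short), struct_size(&long));
    println!("len:      {} vs {}", short.len(), long.len());
    println!("capacity: {} vs {}", short.capacity(), long.capacity());

    // Borrowing part of the data yields a &str: an address plus a length.
    let view: &str = &long[..5];
    println!("borrowed view of {} bytes", view.len());
}
```

This is why a String can be stored in a local variable or a struct field directly, while a bare str cannot.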
Why is it this way? Because the entire point of
String is to manage a memory region containing string data, and it can't do that if it is the memory region containing string data.
I understand that for variables of unknown size, you must place the data on the heap
This is a misconception. The heap is the most obvious place to put variably-sized data, but it is not the only one. In principle, all it takes is an
alloca equivalent to allocate variably-sized data on the stack.
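Even without an alloca-style allocation, str data does not have to live on the heap. The sketch below uses a made-up helper, view, to borrow a &str from a fixed-size local byte array, and compares it with a string literal, whose bytes are stored in the compiled binary. Note the buffer itself has a fixed size here; only the length of the borrowed view is chosen at run time.

```rust
// Hypothetical helper: borrow a str from raw bytes, wherever they live.
fn view(bytes: &[u8]) -> &str {
    std::str::from_utf8(bytes).expect("bytes should be valid UTF-8")
}

fn main() {
    // A string literal: the bytes are baked into the program binary.
    let literal: &'static str = "hello";

    // A fixed-size array held in a local variable, so no heap allocation.
    let buf: [u8; 5] = *b"hello";

    // The length of the view is picked at run time; the data stays in buf.
    let n = 3;
    let prefix: &str = view(&buf[..n]);

    println!("{} {} {}", literal, view(&buf), prefix);
}
```

In both cases the &str is just an address and a length; it does not care what kind of memory the address points into.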