I've taken this picture and code from The Rust Book.
Why does s point to s1 rather than just the data on the heap itself?
If so, is this how it works? How does s point to s1? Is it allocated memory with a ptr field that contains the memory address of s1? Then, does s1, in turn, point to the data?
In s1, I appear to be looking at a variable with a pointer, length, and capacity. Is only the ptr field the actual pointer here?
This is my first systems level language, so I don't think comparisons to C/C++ will help me grok this. I think part of the problem is that I don't quite understand what exactly pointers are and how the OS allocates/deallocates memory.
fn main() {
    let s1 = String::from("hello");
    let len = calculate_length(&s1);
    println!("The length of '{}' is {}.", s1, len);
}

fn calculate_length(s: &String) -> usize {
    s.len()
}
u64
). Here is an example of these concepts (playground):
use std::mem::MaybeUninit;

fn main() {
    // In a real program this "memory" is implicit (the whole address space);
    // it is made explicit here, and kept small, for the sake of the example.
    let mut memory = [MaybeUninit::<u8>::uninit(); 4096];
    // Every value of type `usize` can be used as an address.
    const SOME_ADDR: usize = 1234;
    // Any address can be safely bound to a raw pointer,
    // which *may* point to either valid or invalid memory.
    let _ptr = SOME_ADDR as *const u8;
    // Knowing an address, you can find the matching offset in our memory.
    let other_ptr: *mut u8 = unsafe { memory.as_mut_ptr().add(SOME_ADDR) }.cast();
    // Oversimplified allocation; in real life the OS hands out a block of memory.
    unsafe { other_ptr.write(15) };
    // Now it is *meaningful* (i.e. there is no undefined behavior) to make a reference.
    let refr: &u8 = unsafe { &*other_ptr };
    println!("{}", refr);
}
I hope that clarifies most things, but let's cover the questions explicitly.
Why does s point to s1 rather than just the data on the heap itself?
s is a reference (i.e. a valid pointer), so it holds the address of s1. The compiler might (and probably would) optimize it so that it becomes the same piece of memory as s1, but logically it remains a different object that points to s1.
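To make the distinction concrete, here is a minimal sketch that records the addresses involved. The function name `three_addresses` is made up for this illustration, and the actual numbers differ from run to run; only the relationships between them matter.

```rust
/// Returns, in order: the address of the reference `s` itself, the address
/// stored in `s`, the address of `s1`, and the address of the heap data.
/// (`three_addresses` is a hypothetical helper, not part of any library.)
fn three_addresses() -> (usize, usize, usize, usize) {
    let s1 = String::from("hello");
    let s: &String = &s1;

    // Where the reference `s` itself is stored (a stack slot).
    let addr_of_s = &s as *const &String as usize;
    // What `s` contains: the address of `s1`.
    let addr_in_s = s as *const String as usize;
    // Where `s1` (pointer, length, capacity) is stored (another stack slot).
    let addr_of_s1 = &s1 as *const String as usize;
    // What the pointer field of `s1` contains: the address of the bytes "hello".
    let addr_of_data = s1.as_ptr() as usize;

    (addr_of_s, addr_in_s, addr_of_s1, addr_of_data)
}
```

Calling `three_addresses()` and printing the tuple with `{:#x}` formatting should show that the second and third values are equal, while the first and the last differ from them.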
How does s point to s1? Is it allocated memory with a ptr field that contains the memory address of s1?
The chain of "pointing" persists, so a call to s.len() is conceptually converted to s.deref().len, and accessing a byte of the string's buffer is converted to s.deref().ptr.add(index).deref().
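The two hops can be spelled out by hand. This is only a sketch of the idea, not what the compiler literally emits: `byte_via_chain` is a hypothetical helper, and because the fields of `String` are private it uses the public `as_ptr()` method in place of a `ptr` field.

```rust
/// Follows the chain `s` -> `s1` -> heap data and reads one byte.
/// Returns `None` if `index` is out of bounds.
fn byte_via_chain(s: &String, index: usize) -> Option<u8> {
    // First hop: `s` holds the address of the `String` value.
    let string: &String = s;
    if index >= string.len() {
        return None;
    }
    // Second hop: the `String` holds the address of the heap buffer.
    let data: *const u8 = string.as_ptr();
    // Offset into the buffer and read; safe here because `index < len`.
    Some(unsafe { *data.add(index) })
}
```

For example, `byte_via_chain(&String::from("hello"), 1)` yields `Some(b'e')`, the same byte that `s.as_bytes()[1]` would give.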
The picture shows three blocks of memory: &s, &s1, and s1.ptr are different memory addresses (unless optimized away). All of them are stored in allocated memory. The first two live on the stack, memory that is set aside before the main function is called; the stack is usually not referred to as allocated memory (a convention I ignore in this answer). The s1.ptr pointer, in contrast, points to memory that the program allocated explicitly at run time (i.e. after entering main).
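A small sketch of the run-time part: an empty `String` created with `String::new()` has no heap buffer yet, and one is requested only when data is pushed, while the variable itself stays at the same stack address. The helper name `grow` is an assumption made for this example.

```rust
/// Returns the capacity before and after pushing "hello", plus whether the
/// stack address of the `String` variable stayed the same.
/// (`grow` is a hypothetical helper for illustration.)
fn grow() -> (usize, usize, bool) {
    // `String::new()` does not allocate: the heap buffer does not exist yet.
    let mut s1 = String::new();
    let stack_addr_before = &s1 as *const String as usize;
    let cap_before = s1.capacity();

    // Pushing data forces a heap allocation at run time.
    s1.push_str("hello");
    let stack_addr_after = &s1 as *const String as usize;

    (cap_before, s1.capacity(), stack_addr_before == stack_addr_after)
}
```

The first value is 0, the second is at least 5, and the flag is true: the heap buffer appeared while the program was running, but the pointer, length, and capacity triple never moved.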
In s1, I appear to be looking at a variable with a pointer, length, and capacity. Is only the ptr field the actual pointer here?
Yes, exactly. Length and capacity are just plain unsigned integers (usize).
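One way to check this is to compare sizes. The sketch below assumes the current standard library layout, in which a `String` occupies exactly three machine words; that layout is an implementation detail rather than a documented guarantee. The helper names are made up for the example.

```rust
use std::mem::size_of;

/// Returns the size in bytes of a machine word, of a `&String`, and of a `String`.
/// (`layout_sizes` is a hypothetical helper for illustration.)
fn layout_sizes() -> (usize, usize, usize) {
    (size_of::<usize>(), size_of::<&String>(), size_of::<String>())
}

/// Length and capacity are ordinary `usize` values, not pointers.
fn len_and_capacity(s1: &String) -> (usize, usize) {
    (s1.len(), s1.capacity())
}
```

With the current implementation, a `&String` is one word (just an address) and a `String` is three words: one pointer plus two integers. `len_and_capacity(&String::from("hello"))` returns a length of 5 and a capacity of at least 5.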