So let's say I have a String, "Foo Bar", and I want to create a substring of "Bar" without allocating new memory.
So I moved the raw pointer of the original string to the start of the substring (in this case offsetting it by 4) and used String::from_raw_parts() to create the new String.
So far I have the following code, which as far as I understand should do this just fine. I don't understand why it does not work.
use std::mem;

fn main() {
    let s = String::from("Foo Bar");
    let ptr = s.as_ptr();
    mem::forget(s);
    unsafe {
        // no error when using ptr.add(0)
        let txt = String::from_raw_parts(ptr.add(4) as *mut _, 3, 3);
        println!("{:?}", txt); // This even prints "Bar" but crashes afterwards
        println!("prints because 'txt' is still in scope");
    }
    println!("won't print because 'txt' was dropped");
}
I get the following error on Windows:
error: process didn't exit successfully: `target\debug\main.exe` (exit code: 0xc0000374, STATUS_HEAP_CORRUPTION)
And these on Linux (cargo run; cargo run --release):
munmap_chunk(): invalid pointer
free(): invalid pointer
I think it has something to do with the destructor of String, because as long as txt is in scope the program runs just fine.
Another thing to notice is that when I use ptr.add(0) instead of ptr.add(4) it runs without an error.
Creating a slice, on the other hand, didn't give me any problems. Dropping it worked just fine:
let t = slice::from_raw_parts(ptr.add(4), 3);
In the end I want to split an owned String in place into multiple owned Strings without allocating new memory.
Any help is appreciated.
The errors come from the way the allocator works. It is undefined behaviour to ask the allocator to free a pointer that it didn't give you in the first place. In this case, the allocator allocated 7 bytes for s and returned a pointer to the first one. However, when txt is dropped, it tells the allocator to deallocate a pointer to byte 4, which it has never seen before. This is why there is no crash when you use add(0) instead of add(4). Even then the call is not correct: String::from_raw_parts documents that the capacity must be the correct value, and here 3 is passed rather than the capacity of the original allocation.
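For contrast, here is a minimal sketch of a call to String::from_raw_parts that does hand the allocator back what it produced. The helper name rebuild is made up for illustration; the pattern of wrapping the String in ManuallyDrop and reusing its pointer, length and capacity unchanged follows the example in the standard library documentation.

```rust
use std::mem::ManuallyDrop;

/// Takes a `String` apart and reassembles it from its raw parts.
/// `rebuild` is a hypothetical helper name; only the std calls are real.
fn rebuild(s: String) -> String {
    // Suppress the destructor of `s` so the buffer is not freed here.
    let mut s = ManuallyDrop::new(s);
    let ptr = s.as_mut_ptr();
    let len = s.len();
    let cap = s.capacity();
    // SAFETY: `ptr`, `len` and `cap` are taken unchanged from a live `String`
    // whose destructor was suppressed, so the new `String` will later free
    // exactly the pointer and capacity the allocator originally handed out.
    unsafe { String::from_raw_parts(ptr, len, cap) }
}
```

Any offset applied to ptr, or any capacity other than the original one, breaks that contract, which is why this route cannot produce an owned String for just the "Bar" part.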
Using unsafe correctly is hard, and you should avoid it where possible.
Part of the purpose of the &str type is to allow portions of an owned string to be shared, so I would strongly encourage you to use those if you can.
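As a small sketch of what that looks like, the two helpers below (both names are made up for illustration) split a string into borrowed halves with str::split_at and check that a slice points into the original buffer rather than into a fresh allocation.

```rust
/// Splits `s` at byte index `mid` into two borrowed halves.
/// Hypothetical helper; it panics if `mid` is not on a char boundary.
fn halves(s: &str, mid: usize) -> (&str, &str) {
    s.split_at(mid)
}

/// Returns true when `part` lies inside the buffer of `whole`,
/// i.e. no separate allocation backs it.
fn shares_buffer(whole: &str, part: &str) -> bool {
    let start = whole.as_ptr() as usize;
    let end = start + whole.len();
    let p = part.as_ptr() as usize;
    p >= start && p + part.len() <= end
}
```

With s holding "Foo Bar", halves(&s, 4) yields "Foo " and "Bar", and the second half starts exactly 4 bytes after s.as_ptr(). The borrow checker keeps s alive for as long as either half is in use.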
If the reason you can't just use &str on its own is that you aren't able to track the lifetimes back to the original String, then there are still some solutions, with different trade-offs:
Leak the memory, so it's effectively static:
let s = String::from("Foo Bar");
// Box::leak gives up ownership, so the buffer is never freed
let s = Box::leak(s.into_boxed_str());
let txt: &'static str = &s[4..];
let s: &'static str = &s[..4];
Obviously, you can only do this a few times in your application, or else you are going to use up too much memory that you can't get back.
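The leaking approach can be wrapped in a small function. This is only a sketch, and leak_split is a hypothetical name; it relies on String::into_boxed_str, Box::leak and str::split_at from the standard library.

```rust
/// Leaks `s` and returns two `'static` slices split at byte index `mid`.
/// Hypothetical helper; it panics if `mid` is not on a char boundary.
fn leak_split(s: String, mid: usize) -> (&'static str, &'static str) {
    // `into_boxed_str` drops any excess capacity; `Box::leak` then gives up
    // ownership, so the buffer is never freed for the rest of the program.
    let leaked: &'static str = Box::leak(s.into_boxed_str());
    leaked.split_at(mid)
}
```

Because both halves are 'static, they can be stored anywhere or moved into other threads, at the price of never getting the memory back.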
Use reference-counting to make sure that the original String stays around long enough for all of the slices to remain valid. Here is a sketch solution:
use std::{fmt, ops::Deref, rc::Rc};

struct RcStr {
    rc: Rc<String>,
    start: usize,
    len: usize,
}

impl RcStr {
    fn from_rc_string(rc: Rc<String>, start: usize, len: usize) -> Self {
        RcStr { rc, start, len }
    }

    fn as_str(&self) -> &str {
        &self.rc[self.start..self.start + self.len]
    }
}

impl Deref for RcStr {
    type Target = str;

    fn deref(&self) -> &str {
        self.as_str()
    }
}

impl fmt::Display for RcStr {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        fmt::Display::fmt(self.as_str(), f)
    }
}

impl fmt::Debug for RcStr {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        fmt::Debug::fmt(self.as_str(), f)
    }
}

fn main() {
    let s = Rc::new(String::from("Foo Bar"));
    let txt = RcStr::from_rc_string(Rc::clone(&s), 4, 3);
    let s = RcStr::from_rc_string(Rc::clone(&s), 0, 4);
    println!("{:?}", txt); // "Bar"
    println!("{:?}", s); // "Foo "
}
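The same idea extends to splitting one String into several owned pieces. The sketch below is a variation on the code above, not part of it: SharedStr, split_shared and owners are hypothetical names, and splitting on whitespace is just an assumption for the example. Rc::new makes one small allocation for the reference count, but the string bytes themselves are not copied.

```rust
use std::{ops::Range, rc::Rc};

/// An owned handle to one part of a shared string buffer.
/// Hypothetical type for illustration.
#[derive(Clone, Debug)]
struct SharedStr {
    buf: Rc<String>,
    range: Range<usize>,
}

impl SharedStr {
    fn as_str(&self) -> &str {
        &self.buf[self.range.clone()]
    }

    /// Number of handles currently keeping the buffer alive.
    fn owners(&self) -> usize {
        Rc::strong_count(&self.buf)
    }
}

/// Splits `s` on whitespace; every piece shares the one original buffer.
fn split_shared(s: String) -> Vec<SharedStr> {
    let buf = Rc::new(s);
    let base = buf.as_ptr() as usize;
    let pieces: Vec<SharedStr> = buf
        .split_whitespace()
        .map(|word| {
            // Each `word` borrows from `buf`, so its byte offset is the
            // distance between the two start pointers.
            let start = word.as_ptr() as usize - base;
            SharedStr {
                buf: Rc::clone(&buf),
                range: start..start + word.len(),
            }
        })
        .collect();
    pieces
}
```

Each SharedStr can be moved and dropped independently; the buffer is freed when the last one goes away. If the pieces need to cross threads, Arc could be swapped in for Rc.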