I was looking at String in the standard library and there was so much unsafe code, like this one:
    #[inline]
    #[stable(feature = "rust1", since = "1.0.0")]
    pub fn remove(&mut self, idx: usize) -> char {
        let ch = match self[idx..].chars().next() {
            Some(ch) => ch,
            None => panic!("cannot remove a char from the end of a string"),
        };
        let next = idx + ch.len_utf8();
        let len = self.len();
        unsafe {
            ptr::copy(self.vec.as_ptr().add(next), self.vec.as_mut_ptr().add(idx), len - next);
            self.vec.set_len(len - (next - idx));
        }
        ch
    }
Why is there so much unsafe code in the standard library? How is the language still safe?
There is a misconception here that using unsafe is automatically unsound and will cause memory errors. It is not, and it does not. In fact, you are not allowed to cause memory errors even inside unsafe blocks; if you do, the code exhibits undefined behavior and the whole program is ill-defined. The point of unsafe is to allow operations whose safety the compiler cannot verify. The responsibility for avoiding undefined behavior then falls to the developer, who must understand the safety requirements of the unsafe syntax, functions, and other items being used.
The design philosophy for writing and using unsafe functions is this: if some set of parameters or circumstances may cause a function to exhibit undefined behavior, then the function must be marked unsafe, and its documentation should state which parameters and circumstances are safe. The caller must then abide by this documentation within an unsafe block. The flip side is that if a function is not marked unsafe, then no possible parameters or circumstances may cause undefined behavior.
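As a minimal sketch of that contract (the function names byte_at_unchecked and byte_at are hypothetical and not taken from the standard library), an unsafe function documents what the caller must guarantee, and a safe wrapper establishes that guarantee itself before calling it:

```rust
/// Returns the byte at `idx` without a bounds check.
/// (Hypothetical helper, for illustration only.)
///
/// # Safety
///
/// The caller must guarantee that `idx < bytes.len()`.
/// Passing an out-of-range index is undefined behavior.
pub unsafe fn byte_at_unchecked(bytes: &[u8], idx: usize) -> u8 {
    // SAFETY: the caller promised that `idx` is in bounds.
    unsafe { *bytes.get_unchecked(idx) }
}

/// Safe wrapper (also hypothetical): no input can trigger undefined
/// behavior, so this function does not need to be marked `unsafe`.
pub fn byte_at(bytes: &[u8], idx: usize) -> Option<u8> {
    if idx < bytes.len() {
        // SAFETY: `idx < bytes.len()` was checked just above.
        Some(unsafe { byte_at_unchecked(bytes, idx) })
    } else {
        None
    }
}
```

Callers of byte_at never write unsafe themselves; only code that opts into byte_at_unchecked takes on the documented obligation.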
In this situation, shifting bytes around in memory is not always safe, so you must use unsafe to call ptr::copy. However, the method .remove() is not marked unsafe, so whatever happens in the unsafe block must be safe if the developers of the Rust standard library have done their job, and I'm sure they have. You can see that any possible input is bounds-checked (slicing with self[idx..] panics if idx is out of range or not on a char boundary) and that what is being copied lies within the already allocated block. The only way this could cause undefined behavior is if there was already undefined behavior or broken invariants before calling this function.
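The same check-then-copy shape can be sketched outside the standard library. The function below is a hypothetical illustration on a plain Vec<u8> (remove_range is an invented name, not a std API): the safe assertion runs for every input, so the unsafe block only ever sees ranges that lie inside the buffer.

```rust
use std::ptr;

/// Removes `buf[start..end]` by shifting the tail of the vector left.
/// (Hypothetical helper, for illustration only.)
pub fn remove_range(buf: &mut Vec<u8>, start: usize, end: usize) {
    let len = buf.len();
    // Safe check: panics on any invalid range before unsafe code runs.
    assert!(start <= end && end <= len, "range out of bounds");
    unsafe {
        // SAFETY: `start <= end <= len`, so both pointers stay inside the
        // initialized part of the buffer and `len - end` bytes are readable
        // from `end` and writable from `start`. `ptr::copy` permits the
        // source and destination regions to overlap.
        let p = buf.as_mut_ptr();
        ptr::copy(p.add(end), p.add(start), len - end);
        // SAFETY: the new length is not larger than the old one, and every
        // byte below it is still initialized.
        buf.set_len(len - (end - start));
    }
}
```

Because the assertion covers every input, the function can be exposed as safe even though its body is not.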
You cannot build the Rust standard library without using unsafe. The manual memory management that computers are built on is inherently full of memory foot-guns; however, you can build on these "unsafe" operations and wrap them in guarantees that make them safe.
Some of the unsafety is required, but other instances are there simply for performance. Safe abstractions may require many checks to ensure they are safe, especially if any kind of dynamism is involved, but if your existing invariants are encoded correctly, then using unsafe can skip those checks while still being safe. This function could probably have been written entirely safely by relying on other self.vec methods (which would use unsafe internally at some point), but that might add bounds checks that are entirely unnecessary.
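For comparison, here is one possible all-safe rewrite. It is a sketch that uses String::replace_range instead of the self.vec methods mentioned above, and remove_safe is a hypothetical free function, not how the standard library does it. The slicing already validates idx, and replace_range then checks both ends of the range again, which is the kind of redundant work the unsafe version avoids.

```rust
/// Removes and returns the char starting at byte index `idx`,
/// using only safe APIs. (Hypothetical helper, for illustration only.)
pub fn remove_safe(s: &mut String, idx: usize) -> char {
    // Same check as the original: panics if `idx` is out of range,
    // not on a char boundary, or at the end of the string.
    let ch = match s[idx..].chars().next() {
        Some(ch) => ch,
        None => panic!("cannot remove a char from the end of a string"),
    };
    let next = idx + ch.len_utf8();
    // `replace_range` verifies the range boundaries a second time
    // before shifting the remaining bytes down.
    s.replace_range(idx..next, "");
    ch
}
```

The observable behavior should match String::remove for valid indices; whether the extra checks cost anything measurable after optimization is not something this sketch establishes.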
The standard library is expected to operate with as little overhead as possible while staying safe (unless the function is marked unsafe, of course).