Search code examples
rustunsafe

When should unsound, private functions be marked unsafe?


The Rustonomicon gives the following example for an unsound, but not unsafe function in the context of a hypothetical Vec implementaiton:

fn make_room(&mut self) {
    // grow the capacity
    self.cap += 1;
}

This function is purposly not marked unsafe, despite being able to break the internal assumptions of Vec, because it is not pub, so the public API remains sound despite it's existance.


With this line of reasoning you could argue that no private function should ever have to be unsafe, because in the end they can't really be worse than this one. (Allows safe code in the same module to produce segfault).

So, when should you mark an unsound, private function as unsafe ?.

As I don't want this to become opinion based, are there any good resources talking about guidelines for this, or could you give some anecdotal evidence from larger projects?


Solution

  • TLDR: Mark every unsound function unsafe, unless you have a very good reason not to.


    As pointed out by @Finomis, quoting from the Rust reference:

    It is the programmer's responsibility when writing unsafe code to ensure that any safe code interacting with the unsafe code cannot trigger these behaviors."

    The following are some concrete patterns and examples where private, unsound functions that are not marked unsafe are used and why.


    1. The unsafe Core Pattern

    Some crates, especially ones implementing data structures, have to carry lots of assumptions about the state of their data. Marking every private function that makes use of these assumptions as unsafe prevents the detection of unsafe patterns like pointer dereferences or unsafe calls, as everything is already unsafe.

    A pattern to get around this issue is to isolate the functions with these assumptions into a separate crate/module with accompanying // SAFETY comments, were it is understood that private functions have to take care to uphold them, even if they aren't marked unsafe. This can sometimes even allow you to enable #![deny(unsafe_code)] for the rest of the codebase.

    An example of this pattern can be seen in the IndexHashMap crate.

    2. Implementing Traits

    When implementing a foreign trait for a private type, (e.g. implementing std::hash::Hash so the type can be used with std::HashMap), the function cannot be marked unsafe, because the trait API requires it not to be. An unsound implementation can still be fine though, if special care is taken to make sure the private type is always in a valid state. (If anybody has a good example for this, please reply with a comment).

    3. Macros

    While macros aren't technically functions, they are often used in similar ways. Somewhat surprisingly, macros cannot be marked unsafe. The general consensus seems to be that macros will always have to be checked carefully, so there's no need for the unsafe keyword. So the ! in the call kind of means unsafe. Still, usecases for private unsound macros exist, meaning that their callsites will not be found by a simple grep for unsafe.

    One example of this can be found in in encoding_rs, were the macro contains a return (so it cannot be a function) based on an unsafe conditional. This is admittedly a bad example, as in this case the macro contains an unsafe block, so it could have been essentially turned into an unsafe macro by leaving out that unsafe block in the macro, so the caller has to place the macro in an unsafe block to make it compile. But this trick would not be possible if the macro was purely unsound (without requiring unsafe).


    Additional Points

    • This might seem obvious, but: A function does not become unsafe through it's callers. If a (private) function foo outputs bytes, and another function bar unsafely calls std::str::from_utf8_unchecked on the output of foo, that is generally no reason to mark foo as unsafe, despite the fact that changing foo could now cause UB. A mechanism to better express these kinds of caller assumptions is unsafe trait.

    • All extern functions are implicitly unsafe. It might seem like one usecase where this would come up are binding for other languages, but here rust elects to be a bit overprotective, forcing you to write a safe wrapper around your highly dangerous extern "C" { fn add(a: i32, b: i32) -> i32}. What does happen in practice though, is that a bit of binding glue code will be written in C/C++ itself, which is arguably no better than an unsafe core, because all code in those languages is inherently unsafe.

    • Function pointers are no valid reason to do this. It might seem like there's a case for it if you want an API like sort_by_key to take a function that's inherently unsafe, but it's always possible to correctly mark that function as unsafe and give a lambda like |x| unsafe {my_actual_fn(x)} to the API instead.

    • Rust has unfortunately made the decision that the a function marked unsafe (meaning it's API is not safe an can be misused) implicitly has an unsafe body (meaning that some safety restrictions on the implementation are lifted), without requiring any unsafe blocks. Fortunately, steps are being taken to rectify this. This change could reduce the need for the unsafe core pattern.