Tags: c++, unique-ptr, move-semantics

Confused about returning std::unique_ptr


So say I have a tree structure, with a node defined like:

struct Node
{
    byte key_part;
    unique_ptr<Node> left, right, middle;
    TValue value;
};
unique_ptr<Node> root;

The semantics here are correct: I absolutely don't want others to get hold of a node itself. Each node is uniquely owned by my tree.

However, I want to add an internal helper function to search the tree based on internal semantics and return a ...something? to the node structure. This is a private function, only used internally to either insert a node and fill it in with data from outside, delete a node or return the value field (not the node itself) outside. The node in no way leaks outside.

So how do I write the function? I can't return unique_ptr<Node> from it, right? Otherwise move semantics will kick in and the tree gives up ownership of the node. I could return a pointer, but that seems to break the system, plus it leads to aliasing issues. A reference is similar, but it prevents me from using nullptr to mark no result found. In essence, what would the return type of this be:

/* return type here */ search(const span<byte>& key)
{
    auto node = /* recursive search here, etc */;
    return /* what do I put here, given I have a unique_ptr<Node> reference? */;
}

Solution

  • "However, I want to add an internal helper function to search the tree based on internal semantics and return a ...something?"

    Just return a raw pointer. Raw pointers are the canonical representation of a non-owning nullable reference, which is exactly what you want.

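    For reassurance, see the C++ Core Guidelines on this subject, for example R.3 ("A raw pointer (a T*) is non-owning") and F.7 ("For general use, take T* or T& arguments rather than smart pointers").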

    "I could return a pointer, but that seems to break the system, plus it leads to aliasing issues."

    It doesn't break anything.

    It doesn't cause aliasing issues either - these arise when you have:

    • two or more pointers (to the same or pointer-interconvertible types)
    • which the compiler has to assume may overlap (i.e., it can't prove they will never overlap)
    • where you write through one, and read through the other
    • so the compiler has to assume the write could have changed the value it should read, so cannot cache that read
    • but you know the pointers really will never overlap and would prefer the optimizer to cache that read

    The classic example of this is the distinction between memcpy (which assumes non-overlapping ranges) and memmove (which has to take extra precautions to deal with possibly-overlapping ranges).
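
    As a small illustration of the memmove side of that distinction, the sketch below copies a range onto an overlapping range in the same buffer. The helper name copy_within and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copies the first n characters of s to offset k
// within the same buffer, so source and destination may overlap.
std::string copy_within(std::string s, std::size_t n, std::size_t k)
{
    assert(k + n <= s.size());                // stay inside the buffer
    std::memmove(s.data() + k, s.data(), n);  // overlapping ranges are allowed
    return s;
}
```

    Calling copy_within("abcdef", 4, 2) yields "ababcd". Swapping in std::memcpy for that call would be undefined behavior, because the source and destination ranges overlap.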

    Whether you use a raw pointer or a smart one doesn't affect that at all.
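
    A minimal sketch of the aliasing situation described above, with hypothetical function names. In both overloads the compiler cannot prove that src points somewhere other than the destination int, so it has to re-read *src after the first write. Holding the destination in a unique_ptr changes nothing about that.

```cpp
#include <cassert>
#include <memory>

// Raw-pointer version: dst and src may point at the same int.
int add_twice(int* dst, const int* src)
{
    assert(dst && src);
    *dst += *src;
    *dst += *src;  // *src must be re-read: the first write may have changed it
    return *dst;
}

// Smart-pointer version: dst owns its int, but src may still
// point at that very int, so the same re-read is required.
int add_twice(std::unique_ptr<int>& dst, const int* src)
{
    assert(dst && src);
    *dst += *src;
    *dst += *src;
    return *dst;
}
```

    With distinct objects holding 1 and 2, either overload returns 5. When source and destination are the same int holding 1, the result is 4 rather than 3, which is exactly why the read of *src cannot be cached.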