Tags: c++, pointers, shared-ptr, abi

How does a double pointer reconcile with a shared_ptr?


Upfront disclosure: I think the entire thing is nonsense and only "works" by chance. I found this code and it seems to "work" for low-enough values of "work" (as in it does not crash when run, which doesn't mean much), and I don't get why.

The issue at hand is an extern "C" API exposed as a DLL/.so, which is then called over FFI (from Python in this case), but the extern "C" functions take a shared_ptr by value. And yet it moves.

The C++ code:

#include <memory>
 
extern "C" {
  int make(std::shared_ptr<int> p) {
    p = std::make_shared<int>(42);
    return 0;
  }
 
  int get(std::shared_ptr<int> p) {
    return *p;
  }
}

The caller:

import ctypes
 
lib = ctypes.CDLL('lib.so')
 
p = ctypes.c_void_p()
lib.make(ctypes.byref(p))
print(lib.get(ctypes.byref(p)))

After building the C++ code as a shared library (named lib.so), the Python code runs fine and does print 42. This code was tested on macOS/ARM64 compiled with Clang, but the original code this was munged from reportedly works on Linux/ARM32 (compiled with GCC) and Windows/AMD64 (compiled with MSVC).

My working hypothesis is that in all these runtimes shared_ptr happens to have the object pointer as its first member, and that the compilers decide to pass it by reference in order to avoid the copy (and thus the incref/decref). make therefore writes the object pointer over Python's p, and writes the control block pointer into whatever memory sits right after it (maybe somewhere on the stack). When the shared pointer is freed, the memory remains accessible (possibly because it's in a small object pool / page rather than being unmapped).

Then get does not need to touch the refcount (because the shared_ptr is again passed by reference), so it just double-dereferences our pointer. That is a use-after-free (UAF), but the memory is still around and it works out.

Note: in the original there is no UAF because the shared_ptr is obtained from a longer-lived structure, so this simplified version is a touch worse than the original.


Solution

  • Some speculative facts:

    • A shared_ptr is often 16 bytes (on 64-bit architectures) while void* is 8 bytes. A shared_ptr contains a pointer to the object it refers to, then another pointer to a "control block" containing the refcount and destructor. (This is just one possible implementation of shared_ptr)

    • Overwriting memory doesn't have to immediately lead to a crash; often the memory allocator even allocates more memory than you ask for (e.g. it may round up to a multiple of 16 bytes)

    • Non-trivial class types are passed by reference. Yes, really. Just as if you had written & after the type. I'm not making this up. See for example the Itanium C++ ABI, on which a lot of ABIs are based: a type with a non-trivial copy constructor, move constructor, or destructor (shared_ptr has all three) is passed as the address of a temporary. To simulate the parameter not being a reference, the caller makes a copy and then destroys it after the call. You didn't.

    So, probably: You were meant to pass a reference to a shared_ptr to the make function. You did. The make function overwrote it with a real bona fide shared_ptr. Then, instead of destroying the object as the ABI expects the caller to (which is what makes the parameter pretend not to be a reference), you passed the same reference to the get function, which read the new value assigned inside make.