abi::__cxa_demangle -- why does the buffer need to be `malloc`-ed?

The documentation of abi::__cxa_demangle (such as https://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-html-USERS-4.3/a01696.html) specifies that the second argument, char * output_buffer, needs to be malloc-ed.

That means that a character buffer allocated on the stack such as the following is not allowed.

  enum {N = 256};
  char output_buffer[N];
  size_t output_length = 0;
  int status = -4;
  char * const result = abi::__cxa_demangle(mangled_name,
                        output_buffer, &output_length, &status);

Two questions:

Why is an output_buffer on stack not allowed?
Why is a different pointer returned when an output buffer was already passed?

Influenced by the example of backtrace(), I would have imagined an API like the following

// Demangle the symbol in 'mangled_name' and store the output
// in 'output_buffer' where 'output_buffer' is a caller supplied
// buffer of length 'output_buffer_length'. The API returns the 
// number of bytes written to 'output_buffer' which is not
// greater than 'output_buffer_length'; if it is
// equal to 'output_buffer_length', then output may have been
// truncated.
size_t mydemangle(char const * const mangled_name,
                  char * output_buffer,
                  size_t const output_buffer_length);

Solution

1) Why is an output_buffer on stack not allowed?

From the link you provided: if output_buffer is not long enough, it is expanded using realloc. realloc only accepts pointers that came from malloc, calloc, or realloc, so it cannot resize data on the stack. A stack frame is generally of fixed size (alloca being a special case).

2) Why is a different pointer returned when an output buffer was already passed?

When realloc is used, there is no guarantee that you will get back the same pointer. For example, if there is not enough contiguous free memory at that location, the allocator has to move the block somewhere else. The returned pointer is therefore the one to use and to free afterwards.
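To make both points concrete, here is a minimal sketch of a call that respects the documented contract. The helper name `demangle`, the initial size of 256 bytes, and the fall-back of returning the input unchanged are illustrative assumptions, not part of the library. It also assumes a compiler that ships `<cxxabi.h>` (GCC or Clang).

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang)

#include <cstdlib>    // std::malloc, std::free
#include <string>

// Hypothetical helper (not part of any library): returns the demangled form
// of 'mangled_name', or the input unchanged if demangling fails.
std::string demangle(char const* mangled_name)
{
    // The buffer comes from malloc because __cxa_demangle may realloc it.
    std::size_t length = 256;  // arbitrary starting size (assumption)
    char* buffer = static_cast<char*>(std::malloc(length));
    if (buffer == nullptr) {
        return mangled_name;
    }

    int status = -4;
    char* result = abi::__cxa_demangle(mangled_name, buffer, &length, &status);

    if (result == nullptr) {
        // Failure (status != 0). The implementations I am aware of leave the
        // passed buffer alone in this case, so it is still ours to free.
        std::free(buffer);
        return mangled_name;
    }

    // Success: 'result' may or may not equal 'buffer', depending on whether
    // the buffer had to be expanded. Only 'result' is freed from here on.
    std::string demangled(result);
    std::free(result);
    return demangled;
}
```

For example, `demangle("_Z3foov")` should give `foo()`. The linked documentation also allows passing NULL for output_buffer (and for length), in which case the function mallocs the result itself and the caller only has to free the returned pointer.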

If I had to guess why the API was designed this way, it would be that it's usually considered good practice not to allocate memory in a function and then return references to that memory. Instead, the caller is made responsible for both allocation and deallocation. This helps avoid unexpected memory leaks and lets a user of the API apply their own memory management scheme, for example to avoid memory fragmentation. The potential use of realloc kind of messes this idea up, though. You could probably work around it by passing a malloc-ed block large enough that realloc is never called.