Tags: c, inlining, multiple-return-values

Efficiently returning multiple values in C


When a C function has to return multiple values, there are a few ways to go about it.

Right now I'm interested in the relative efficiency of two of those methods:
a) bundle the values in a struct foo. Populate a local foo, and return that.
b) pass pointers to be populated.
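
For concreteness, the two styles might look like the sketch below. The names `struct divmod_result`, `divmod_struct`, and `divmod_ptr` are hypothetical, chosen only for illustration; they are not from the legacy code.

```c
/* (a) Bundle the results in a small struct and return it by value. */
struct divmod_result {
    int quot;
    int rem;
};

struct divmod_result divmod_struct(int num, int den)
{
    /* Populate a local struct, then return a copy of it. */
    struct divmod_result r = { num / den, num % den };
    return r;
}

/* (b) Write the results through caller-supplied pointers. */
void divmod_ptr(int num, int den, int *quot, int *rem)
{
    *quot = num / den;
    *rem  = num % den;
}
```

A caller would write `struct divmod_result r = divmod_struct(17, 5);` for (a), and `int q, m; divmod_ptr(17, 5, &q, &m);` for (b).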

(I'm working on some legacy code that has a mix of the two.)

For the purposes of this post:

  • All returned values are primitives: ints, pointer values, etc. So sizeof(foo) is very small.
  • Making struct foo opaque isn't a concern.
  • The functions in question have at most 12 parameters, including any ptr-to-return-value parameters.
  • Assume a somewhat modern compiler, e.g. gcc 11 or later.

Obviously inlining would make the question moot.
Can the different methods affect the compiler's ability to inline?
If not inlined, will there be a performance difference between the two methods?

Can the placement of pointer-to-return-value parameters in the argument list have an effect, either on the compiler's ability to inline or on non-inlined performance?

Edited (a) for clarity.


Solution

  • This is ABI-specific.

    On Linux / x86-64 (the System V ABI), a struct of two machine words (e.g. two pointers, two intptr_t, or two longs) is returned in two registers. This is a lot faster than, say, malloc-ing it, and might be faster than writing a two-word struct into memory that the caller allocated on its call stack. That stack memory is likely to sit in a fast CPU cache, though; remember that on recent processors a cache miss may cost on the order of a hundred nanoseconds or more, i.e. the time needed for at least a hundred register-to-register integer additions.

    But inlining a function is not always faster. You could also use partial evaluation techniques or C++ code generation.

    With a recent GCC, also consider compiling and linking all C or C++ files with link-time optimization (e.g. -flto -O2 on both the compile and link commands), which allows inlining across translation units.
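
To make the register-return point concrete, here is a minimal sketch assuming Linux / x86-64 with the System V calling convention. The names `struct pair`, `struct triple`, `make_pair`, and `make_triple` are hypothetical. Under that ABI, the two-word struct comes back in the RAX:RDX register pair, while the three-word struct exceeds 16 bytes and is instead written through a hidden pointer supplied by the caller, which is close to what method (b) does by hand.

```c
#include <stdint.h>

/* Two machine words: 16 bytes on x86-64 Linux.
   Assumed (System V ABI) to be returned in the RAX:RDX register pair. */
struct pair {
    intptr_t first;
    intptr_t second;
};

/* Three machine words: 24 bytes on x86-64 Linux.
   Too large for the register pair, so the caller passes a hidden
   pointer to its own storage and the callee writes the result there. */
struct triple {
    intptr_t first;
    intptr_t second;
    intptr_t third;
};

/* Sanity checks: members of identical type leave no padding. */
_Static_assert(sizeof(struct pair) == 2 * sizeof(intptr_t),
               "pair should be exactly two words");
_Static_assert(sizeof(struct triple) == 3 * sizeof(intptr_t),
               "triple should be exactly three words");

struct pair make_pair(intptr_t a, intptr_t b)
{
    struct pair p = { a, b };
    return p;
}

struct triple make_triple(intptr_t a, intptr_t b, intptr_t c)
{
    struct triple t = { a, b, c };
    return t;
}
```

Compiling this with `gcc -O2 -S` and reading the generated assembly for the two functions is one way to check what a particular compiler and target actually do, rather than relying on the ABI description alone.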