When trying to optimize return values on x86_64, I noticed a strange thing. Namely, given the code:
#include <cstdint>
#include <tuple>
#include <utility>
using namespace std;
constexpr uint64_t a = 1u;
constexpr uint64_t b = 2u;
pair<uint64_t, uint64_t> f() { return {a, b}; }
tuple<uint64_t, uint64_t> g() { return tuple<uint64_t, uint64_t>{a, b}; }
Clang 3.8 outputs this assembly code for f:
movl $1, %eax
movl $2, %edx
retq
and this for g:
movl $2, %eax
movl $1, %edx
retq
which both look optimal. However, when compiled with GCC 6.1, the generated assembly for f is identical to what Clang outputs, while the assembly generated for g is:
movq %rdi, %rax
movq $2, (%rdi)
movq $1, 8(%rdi)
ret
It looks like the type of the return value is classified as MEMORY by GCC but as INTEGER by Clang. I can confirm that linking Clang-compiled code with GCC-compiled code can result in segmentation faults (Clang calling GCC-compiled g(), which writes to wherever %rdi happens to point) and in an invalid value being returned (GCC calling Clang-compiled g()). Which compiler is at fault?
As davmac's answer shows, the libstdc++ std::tuple is trivially copy constructible, but not trivially move constructible. The two compilers disagree on whether the move constructor should affect the argument passing conventions.
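The trait difference can be checked directly with the standard type traits. The following is a minimal sketch, not taken from davmac's answer; CtorTraits, ctor_traits and print_ctor_traits are hypothetical names introduced here for illustration. The result for std::tuple depends on the standard library implementation and version, so no expected output is shown.

```cpp
#include <cstdint>
#include <cstdio>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical helper (not from the original post): the two properties
// that the compilers weigh differently when classifying a return type.
struct CtorTraits {
    bool trivial_copy;  // std::is_trivially_copy_constructible
    bool trivial_move;  // std::is_trivially_move_constructible
};

template <typename T>
constexpr CtorTraits ctor_traits() {
    return {std::is_trivially_copy_constructible<T>::value,
            std::is_trivially_move_constructible<T>::value};
}

using Pair  = std::pair<std::uint64_t, std::uint64_t>;
using Tuple = std::tuple<std::uint64_t, std::uint64_t>;

// Prints the traits for both return types from the question.
// The values for Tuple are implementation-specific.
inline void print_ctor_traits() {
    constexpr CtorTraits p = ctor_traits<Pair>();
    constexpr CtorTraits t = ctor_traits<Tuple>();
    std::printf("pair : trivial copy=%d, trivial move=%d\n",
                p.trivial_copy, p.trivial_move);
    std::printf("tuple: trivial copy=%d, trivial move=%d\n",
                t.trivial_copy, t.trivial_move);
}
```

Calling print_ctor_traits() from main under each compiler and standard library of interest shows whether the tuple's move constructor is trivial there, which is the property the two compilers treat differently.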
The C++ ABI thread you linked to seems to explain that disagreement: http://sourcerytools.com/pipermail/cxx-abi-dev/2016-February/002891.html
In summary, Clang implements exactly what the ABI spec says, whereas G++ implements what the spec was supposed to say but was never updated to actually say.