c++ c++17 language-lawyer implicit-conversion overload-resolution

Compilers work differently for overload resolution with user-defined conversion to rvalue reference


I ran into surprising behavior: gcc and clang select different overloaded constructors when the argument is produced implicitly by a user-defined conversion operator.

The code in question:

#include <cstdio>
#include <utility>
#include <type_traits>

template <typename T>
class foo {
  T& t_;
public:
  foo(T& t) : t_(t) {}
  operator T() const & { return t_; }
  operator T&&() && { return std::move(t_); }
};

class bar {
  int val_;
public:
  bar(int v) : val_(v) {}
  bar(const bar& b) : val_(b.val_) { printf("copy constructed\n"); }
  bar& operator=(const bar& b)     { printf("copy assigned\n"); val_ = b.val_; return *this; }
  bar(bar&& mv) : val_(mv.val_)    { printf("move constructed\n"); mv.val_ = -1; }
  bar& operator=(bar&& mv)         { printf("move assigned\n"); val_ = mv.val_; mv.val_ = -1; return *this; }
};

int main() {
  bar v(1);
  foo<bar> f(v);
  bar v2(std::move(f));
}

The class foo is a wrapper around a T. It is meant to convert implicitly to T when the foo object is an lvalue, and to T&& when the foo object is an rvalue (for example, after std::move).

However, some compilers pick the conversion to T rather than the conversion to T&&, even though the foo object is an rvalue. The results:

⟩ clang++-15 -std=c++17 test.cpp && ./a.out
copy constructed

⟩ clang++-15 -std=c++14 test.cpp && ./a.out
move constructed

⟩ g++ -std=c++14 test.cpp && ./a.out
move constructed

⟩ g++ -std=c++17 test.cpp && ./a.out
move constructed

Only with clang and -std=c++17 or higher, the copy constructor is preferred to the move constructor.

Compiler versions:

⟩ g++ --version
g++ (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

⟩ clang++-15 --version
Ubuntu clang version 15.0.4-++20221102053308+5c68a1cb1231-1~exp1~20221102053355.92
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

Why is the precedence different across different compilers and C++ versions? Or am I violating some rule?


Solution

  • In

    bar v2(std::move(f));
    

    the top-level overload resolution is between the copy constructor and move constructor of bar. bar does have a third constructor, which takes int, but obviously this is not the selected one, so I'm going to disregard it.

    Whichever conversion function of foo<bar> gets selected, the result of that conversion function will be an rvalue, so bar's move constructor will be preferred. With that being said, I'm also going to disregard the copy constructor without delving into the standardese. (It does not appear that the use of the move constructor is part of the controversy here. Although the OP's program prints "copy constructed", this copy construction is done by operator T; the copy constructor is not the one invoked on v2 itself.) The question is what implicit conversion sequence will be used to convert an rvalue of type foo<bar> to the argument type bar&&. All references will be to the C++17 standard since the question was tagged [c++17].

    According to [over.best.ics]/1, the implicit conversion sequence is "governed by the rules for initialization of an object or reference by a single expression". We thus have to consult [dcl.init.ref], which governs the initialization of references. The reference being initialized has type bar&&, and the initialization is a copy-initialization from an rvalue of type foo<bar>. The case that is reached is in p5.2.1 of that section:

    If the initializer expression

    • [inapplicable case elided], or
    • has a class type (i.e., T2 is a class type), where T1 is not reference-related to T2, and can be converted to an rvalue or function lvalue of type "cv3 T3", where "cv1 T1" is reference-compatible with "cv3 T3" (see 16.3.1.6),

    then [...] the result of the conversion [...] is called the converted initializer. If the converted initializer is a prvalue, its type T4 is adjusted to type "cv1 T4" (7.5) and the temporary materialization conversion (7.4) is applied. In any case, the reference is bound to the resulting glvalue (or to an appropriate base class subobject).

    We know that the initializer can be converted to either an xvalue or prvalue of type bar, and bar is reference-compatible with itself, so the conversion will be performed and the reference will be bound to the result of that conversion. The only question is how that conversion is done: by operator bar or operator bar&&?

    To answer that question, we have to look at the referenced section, 16.3.1.6, also known as [over.match.ref]:

    Under the conditions specified in 11.6.3, a reference can be bound directly to a glvalue or class prvalue that is the result of applying a conversion function to an initializer expression. Overload resolution is used to select the conversion function to be invoked. Assuming that "reference to cv1 T" is the type of the reference being initialized, and "cv S" is the type of the initializer expression, with S a class type, the candidate functions are selected as follows:

    • The conversion functions of S and its base classes are considered. Those non-explicit conversion functions that are not hidden within S and yield type "lvalue reference to cv2 T2" (when initializing an lvalue reference or an rvalue reference to function) or "cv2 T2" or "rvalue reference to cv2 T2" (when initializing an rvalue reference or an lvalue reference to function), where "cv1 T" is reference-compatible (11.6.3) with "cv2 T2", are candidate functions. For direct-initialization, [...]

    The argument list has one argument, which is the initializer expression. [Note: This argument will be compared against the implicit object parameter of the conversion functions. —end note]

    Here, T is the cv-unqualified referenced type, that is, bar. There is one conversion function that yields bar (i.e., T2, where T2 is bar), and one that yields bar&& (i.e., "rvalue reference to T2", where T2 is bar). Since bar is reference-compatible with T2 (which is bar itself) in both cases, both conversion functions are candidates. Overload resolution must be done in order to determine which one to call. Here, the argument is still std::move(f), that is, an rvalue of type foo<bar>, but the parameter is the implicit object parameter:

    • for operator T, the implicit object parameter has type foo<bar> const & since the operator was declared with const &.
    • for operator T&&, the implicit object parameter has type foo<bar>&& since the operator was declared with &&.
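
As a small illustration of the two implicit object parameter types (not part of the original answer), the sketch below checks at compile time which value categories each ref-qualifier accepts. The types only_const_lref and only_rref and the detection trait can_call_get are hypothetical names introduced here, and the code assumes C++17 for std::void_t.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical types for illustration; neither appears in the question.
struct only_const_lref { int get() const & { return 1; } };
struct only_rref       { int get() &&      { return 2; } };

// Hypothetical detection trait: true when `std::declval<Obj>().get()` is well-formed.
template <typename Obj, typename = void>
struct can_call_get : std::false_type {};

template <typename Obj>
struct can_call_get<Obj, std::void_t<decltype(std::declval<Obj>().get())>>
    : std::true_type {};

// A `const &` implicit object parameter accepts lvalues and rvalues alike.
static_assert(can_call_get<only_const_lref&>::value, "lvalue binds to const&");
static_assert(can_call_get<only_const_lref&&>::value, "rvalue binds to const&");

// A `&&` implicit object parameter accepts rvalues only.
static_assert(!can_call_get<only_rref&>::value, "lvalue cannot bind to &&");
static_assert(can_call_get<only_rref&&>::value, "rvalue binds to &&");
```

Applied to the question, this suggests that for the rvalue std::move(f) both operator T and operator T&& are viable, so the choice between them has to come from ranking the two conversion sequences.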

    To perform the overload resolution, we consider the respective implicit conversion sequences and then try to determine whether one is better than the other. Both are identity conversions, since an rvalue of type foo<bar> can be bound directly to either foo<bar> const & or foo<bar>&& (see [over.ics.ref]/1). The tie is broken by the familiar rule that binding an rvalue reference to an rvalue is better than binding an lvalue reference to an rvalue. This is rule [over.ics.rank]/3.2.3:

    Standard conversion sequence S1 is a better conversion sequence than standard conversion sequence S2 if

    • [...] or, if not that,
    • [...] or, if not that,
    • S1 and S2 are reference bindings (11.6.3) and neither refers to an implicit object parameter of a non-static member function declared without a ref-qualifier, and S1 binds an rvalue reference to an rvalue and S2 binds an lvalue reference or, if not that, [...]

    (The proviso here does not apply; both functions do have ref-qualifiers).

    Since the implicit conversion sequence for the implicit object parameter is better in the case of operator T&& than operator T, the former is the best viable function.

    GCC is correct. operator T&& should be called, followed by the move constructor. Clang in C++17 mode is calling operator T (which internally performs a copy construction). There is no call to the move constructor, because Clang has elided it.

    This is our clue that Clang's behaviour is probably related to core issue 2327. I think the Clang maintainers have probably optimistically implemented some proposed resolution to this issue (although it does not appear that Richard Smith has provided detailed wording). If the issue is resolved in a way that is consistent with the behaviour that you're seeing, and the committee approves it as a defect report against C++17, Clang's behaviour will be considered correct. If, on the other hand, it is resolved in a way that is consistent with GCC's behaviour, Clang will probably have to change.
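
To check which path a particular compiler and language mode takes, the question's program can be instrumented with counters instead of printf. This is a sketch rather than the OP's code: the counters struct, the global trace, and run_once are hypothetical additions, and the assignment operators are omitted because the test case never uses them.

```cpp
#include <utility>

// Hypothetical bookkeeping, not present in the original program.
struct counters {
    int copies = 0;         // bar copy constructor calls
    int moves = 0;          // bar move constructor calls
    int conv_by_value = 0;  // calls to operator T() const &
    int conv_by_rref = 0;   // calls to operator T&&() &&
};
counters trace;

class bar {
    int val_;
public:
    bar(int v) : val_(v) {}
    bar(const bar& b) : val_(b.val_) { ++trace.copies; }
    bar(bar&& mv) : val_(mv.val_) { ++trace.moves; mv.val_ = -1; }
};

template <typename T>
class foo {
    T& t_;
public:
    foo(T& t) : t_(t) {}
    operator T() const & { ++trace.conv_by_value; return t_; }
    operator T&&() && { ++trace.conv_by_rref; return std::move(t_); }
};

// Runs the statement under discussion once and reports what was called.
counters run_once() {
    trace = counters{};
    bar v(1);
    foo<bar> f(v);
    bar v2(std::move(f));
    (void)v2;
    return trace;
}
```

Going by the outputs reported in the question, one would expect conv_by_rref == 1 with a single move from GCC, and conv_by_value == 1 with a single copy and no move from clang++-15 in C++17 mode; other compiler versions may differ, so the counters are worth inspecting rather than assuming.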