Enabling NRVO when forwarding a function's result via a template function

When forwarding a function's result via a template function, I encountered different behaviour between clang and gcc regarding "named return value optimization (NRVO)". Here is a code snippet:

Foo provideFooAsTemporary() 
{
    return Foo{};
}

template <typename TFn>
auto forwardA(TFn &&fn) 
{
    auto result = fn();
    return result;
}

template <typename TFn>
auto forwardB(TFn &&fn) -> decltype(fn()) 
{
    auto result = fn();
    return result;
}

Foo fooA = forwardA(provideFooAsTemporary);  
Foo fooB = forwardB(provideFooAsTemporary);

Here are my observations (also see godbolt example):

gcc 14.1: Avoids any move operations for both forwardA and forwardB.
clang 18.0.1: Avoids moving only for forwardB, but a move operation occurs for forwardA.

I would have expected forwardA to trigger NRVO with clang as well.

Here are my questions:
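For reference, here is a minimal sketch of how the move count can be observed. The snippet above does not show Foo, so the instrumented Foo below (with its copies and moves counters) and the helpers movesViaForwardA and movesViaForwardB are assumptions made up for illustration. It needs C++17 for the inline static members.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented Foo: counts copy and move constructions.
struct Foo {
    static inline int copies = 0;
    static inline int moves = 0;
    Foo() = default;
    Foo(const Foo&) { ++copies; }
    Foo(Foo&&) noexcept { ++moves; }
};

Foo provideFooAsTemporary()
{
    return Foo{};
}

// Deduced return type, as in the question.
template <typename TFn>
auto forwardA(TFn &&fn)
{
    auto result = fn();
    return result;
}

// Trailing return type, as in the question.
template <typename TFn>
auto forwardB(TFn &&fn) -> decltype(fn())
{
    auto result = fn();
    return result;
}

// Number of move constructions caused by one call through forwardA.
int movesViaForwardA()
{
    Foo::moves = 0;
    [[maybe_unused]] Foo fooA = forwardA(provideFooAsTemporary);
    return Foo::moves;
}

// Number of move constructions caused by one call through forwardB.
int movesViaForwardB()
{
    Foo::moves = 0;
    [[maybe_unused]] Foo fooB = forwardB(provideFooAsTemporary);
    return Foo::moves;
}
```

Each helper returns 0 when NRVO is applied to `return result;` and 1 when the compiler falls back to a move, so the two compilers can be compared by printing both values.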

What exactly is the difference between forwardA and forwardB? Why does the specified trailing return type make a difference?
Is there a simpler way to enable NRVO for clang than what is shown in forwardB?

Solution

Surprisingly, this is a known issue in Clang's current implementation of NRVO:

If a function template uses a deduced return type (i.e. auto or decltype(auto)), then NRVO is not applied for any instantiation of that function template.

See e.g. the bug reports

and probably others.

As far as I can see, there isn't really any reason that application of NRVO should depend on whether or not the return type is explicitly specified. This seems to be a limitation of the current implementation approach.
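If the goal is only to avoid a deduced return type, one alternative spelling to forwardB is to name the type with std::invoke_result_t. This is a sketch under assumptions: forwardC and the instrumented Foo are hypothetical names, the callable is assumed to take no arguments and return by value, and I would expect Clang to treat this like forwardB because the return type is no longer deduced, but that is worth checking on the compiler version at hand.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical instrumented Foo: counts move constructions.
struct Foo {
    static inline int moves = 0;
    Foo() = default;
    Foo(const Foo&) = default;
    Foo(Foo&&) noexcept { ++moves; }
};

Foo provideFooAsTemporary()
{
    return Foo{};
}

// Explicit (non-deduced) return type computed from the callable.
// Assumes fn() returns by value; for a reference-returning callable
// this would return a reference to the local variable.
template <typename TFn>
std::invoke_result_t<TFn> forwardC(TFn &&fn)
{
    auto result = std::forward<TFn>(fn)();
    return result;
}

// Number of move constructions caused by one call through forwardC.
int movesViaForwardC()
{
    Foo::moves = 0;
    [[maybe_unused]] Foo fooC = forwardC(provideFooAsTemporary);
    return Foo::moves;
}
```

As with forwardB, whether the move is elided remains up to the compiler; the helper only makes the outcome observable.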

However, from a conformance point of view, NRVO is never guaranteed to happen and a compiler is always free to choose not to apply it. It can't be forced to do it.
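What is guaranteed since C++17 is copy elision for prvalues: if the wrapper does not need a named local, returning the call expression directly avoids relying on NRVO at all. A minimal sketch, assuming C++17 or later; Pinned, providePinned and forwardDirect are hypothetical names chosen for illustration. The type is deliberately neither copyable nor movable, so the code only compiles if no copy or move is required.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type for illustration: neither copyable nor movable,
// so any copy or move that is not elided would be a compile error.
struct Pinned {
    int value = 42;
    Pinned() = default;
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

Pinned providePinned()
{
    return Pinned{};
}

// Returns the call expression directly. Since C++17 the prvalue
// initializes the caller's object in place; NRVO is not involved.
template <typename TFn>
auto forwardDirect(TFn &&fn)
{
    return std::forward<TFn>(fn)();
}

// Builds a Pinned through the wrapper and reads its value back.
int pinnedValueViaForwardDirect()
{
    Pinned p = forwardDirect(providePinned);
    return p.value;
}
```

This does not help when the result has to be inspected or modified before returning; in that case a named variable is needed and the elision is again up to the compiler.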