clang-tidy: `Loop variable is copied but only used as const reference; consider making it a const reference` - does it really matter?

I'm working on code that clang-tidy is flagging all over the place with

Loop variable is copied but only used as const reference; consider making it a const reference

Current code:

for (auto foo : collection) {
    ...
}

What clang-tidy suggests I use:

for (const auto &foo : collection) {
    ...
}

I can certainly see where a const reference would be more efficient than actually copying the value, but I'm wondering if the compiler will just optimize it that way anyway? I'd hate to dig in and change hundreds of loops all for nothing.

Solution

Constructors can have side-effects beyond constructing the new object. The semantics of the two versions could therefore differ.

Even if that is not the case, the compiler might not be able to determine that during compilation. For example if the copy constructor of the foo type is not defined in the same translation unit and no link-time optimization is used, the compiler wouldn't be able to optimize away the copy, since it might have side-effects that are unknown at compilation time of the current translation unit.

Similarly the loop body might call functions for which the compiler cannot prove that they don't modify foo, in which case the copy can also not be optimized away.

(This is by the way harder for the compiler to prove than it is for clang-tidy to give you the warning, since even a function taking a const reference of foo is technically still allowed to modify it. And even if the foo object itself is const-qualified (e.g. const auto), functions may depend on the address of the foo object being different than that of the container's foo object through other program paths.)

Even if everything is visible in the translation unit and the observable behavior doesn't depend on the copy, the operations might be too complex for the compiler to optimize the copy away.

Example:

auto f(const std::vector<std::string>& x) {
    std::size_t n = 0;
    for(auto y : x)
        n += y.size();
    return n;
}

Both GCC and Clang with -O3 and libstdc++ don't optimize the copies away (see the operator new/operator delete/memcpy calls in the assembly): https://godbolt.org/z/d66ac617M

Also compare to the same code with references instead: https://godbolt.org/z/Tdc3GhcEv

However, in the above example there might still be a visibility issue with the standard library implementation of std::string. Here is maybe a better example in which all definitions are probably visible in the current translation unit:

struct A { int a[64]; };

auto f(const std::vector<std::vector<A>>& x) {
    std::size_t n = 0;
    for(auto y : x)
        n += y.size();
    return n;
}

In this case, GCC still doesn't optimize the copy away, while Clang does, although it still leaves some allocation size check in there for some reason: https://godbolt.org/z/8KWPTx4o5

But if the type is made more complex there will probably be a point at which neither compiler will be able to optimize away the copy even though everything is visible.

If you know that the type of foo is simple, e.g. a scalar like on a std::vector<int> container, then it doesn't really matter in terms of performance, assuming the observable behavior is the same.

You might still want to use the const reference, more specifically the const part, for the purpose of keeping const-correctness throughout your code. How necessary that is, probably goes into opinion-based territory though.