Search code examples
c++language-lawyerpass-by-referencepass-by-valueclang-tidy

Is clang-tidy modernize-pass-by-value only right from C++20 on?


There are many questions and answers about whether to pass arguments by value or by reference. This answer https://stackoverflow.com/a/51706522/2492801 seems to indicate that passing by value and moving is not optimal for lvalues. It seems as if this benchmark for C++17 proves it: https://quick-bench.com/q/8pCCiw5tBtGwh8XaR9EsBTLZTzw. The case "BM_PassLvalueByValueAndMove" is the slowest of all. The cases where the lvalue is passed by reference ("BM_PassLvalueByReferenceAndCopy" and "BM_PassLValueToStructOfferingBoth") are faster.

But the clang-tidy check "modernize-pass-by-value" unconditionally proposes to pass by value in constructors: https://clang.llvm.org/extra/clang-tidy/checks/modernize/pass-by-value.html. Isn't this a bad advice? It may be beneficial for rvalues, but has downsides for lvalues.

But actually this also seems to depend on the language standard. When I change "C++17" to "C++20" in the benchmark (see https://quick-bench.com/q/U9di84h6HaNQw7urp6qEiTjYPMI), suddenly the case "BM_PassLvalueByValueAndMove" is not slower than the other lvalue cases anymore. This remains true for "C++23".

Does that mean that clang-tidy is right with its proposal, but only from C++20 on? What change in the standard has made passing by value and moving better also for lvalues?

Benchmark code:

#include <benchmark/benchmark.h>

#include <string>
#include <utility>

struct PassByValueAndMove
{
    explicit PassByValueAndMove(std::string string)
        : m_string(std::move(string))
    {
    }

    std::string m_string;
};

struct PassByReferenceAndCopy
{
    explicit PassByReferenceAndCopy(const std::string& string)
        : m_string(string)
    {
    }

    std::string m_string;
};

struct PassBoth
{
    explicit PassBoth(std::string&& string)
        : m_string(std::move(string))
    {
    }

    explicit PassBoth(const std::string& string)
        : m_string(string)
    {
    }

    std::string m_string;
};

static void BM_PassRvalueByReferenceAndCopy(benchmark::State& state)
{
    for (auto _ : state)
    {
        static_cast<void>(_);

        {
            PassByReferenceAndCopy instance("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz");
            benchmark::DoNotOptimize(instance);
        }
    }
}

BENCHMARK(BM_PassRvalueByReferenceAndCopy);

static void BM_PassRvalueByValueAndMove(benchmark::State& state)
{
    for (auto _ : state)
    {
        static_cast<void>(_);

        {
            PassByValueAndMove instance("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz");
            benchmark::DoNotOptimize(instance);
        }
    }
}

BENCHMARK(BM_PassRvalueByValueAndMove);

static void BM_PassLvalueByReferenceAndCopy(benchmark::State& state)
{
    for (auto _ : state)
    {
        static_cast<void>(_);

        {
            const std::string      dummy_string = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
            PassByReferenceAndCopy instance(dummy_string);
            benchmark::DoNotOptimize(instance);
        }
    }
}

BENCHMARK(BM_PassLvalueByReferenceAndCopy);

static void BM_PassLvalueByValueAndMove(benchmark::State& state)
{
    for (auto _ : state)
    {
        static_cast<void>(_);

        {
            const std::string  dummy_string = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
            PassByValueAndMove instance(dummy_string);
            benchmark::DoNotOptimize(instance);
        }
    }
}

BENCHMARK(BM_PassLvalueByValueAndMove);

static void BM_PassRvalueToStructOfferingBoth(benchmark::State& state)
{
    for (auto _ : state)
    {
        static_cast<void>(_);

        {
            PassBoth instance("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz");
            benchmark::DoNotOptimize(instance);
        }
    }
}

BENCHMARK(BM_PassRvalueToStructOfferingBoth);

static void BM_PassLValueToStructOfferingBoth(benchmark::State& state)
{
    for (auto _ : state)
    {
        static_cast<void>(_);

        {
            const std::string dummy_string = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
            PassBoth          instance(dummy_string);
            benchmark::DoNotOptimize(instance);
        }
    }
}

BENCHMARK(BM_PassLValueToStructOfferingBoth);

Solution

  • I don't want to simply state "I can't reproduce your issue", but are you certain your measurements are correct and not caused by some quickbench's hiccup?

    I have re-run your code multiple times and I keep getting the following results, which seem intuitive:

    enter image description here

    C++20 seems generally faster, but the proportions remain the same. I can try locally with gprof or perf, maybe that will shed some more light on the problem.

    In of one of the comments you've mentioned:

    The static code analysis tool Helix QACPP proposes the opposite (use const references), if the function parameter exceeds a certain maximum size

    but unfortunately the link is down.

    Limiting the size makes sense though: considering some large object, moving it down the call stack might involve lots of move operations on each individual member, while passing a reference can possibly be optimized to passing a single pointer (to the object itself) and a single move/copy of its multiple members when the actual construction/assignment happens.

    Is that better/worse? Performance-wise: I don't know, measure for your case. It definitely has chances of being more efficient, but keep in mind that compilers are good at optimizing. EDIT: I did quickly measure it and it turns out that there are no significant differences. This might be my trivial to optimize example though. Again, measure, measure, measure.

    #include <map>
    #include <utility>
    #include <array>
    
    
    struct S
    {
        std::array<double, 20000> a;
    };
    
    struct ReallyLarge
    {
        S s1;
        std::map<char, int> m1{};
        std::map<char, long double> m2{};
        std::map<char, int> m3{};
        std::multimap<char, int> m4{};
        std::multimap<int, double> m5{};
        std::map<char, int> m6{};
        std::map<char, long double> m7{};
        std::map<char, int> m8{};
        std::multimap<char, int> m9{};
        std::multimap<int, double> m10{};
        std::map<char, int> m11{};
        std::map<char, long double> m12{};
        std::map<char, int> m13{};
        std::multimap<char, int> m14{};
        std::multimap<int, double> m15{};
        std::map<char, int> m16{};
        std::map<char, long double> m17{};
        std::map<char, int> m18{};
        std::multimap<char, int> m19{};
        std::multimap<int, double> m20{};
        S s2;
    };
    
    
    void f3v(ReallyLarge mm)
    {
        auto x = std::move(mm);
    }
    
    void f2v(ReallyLarge mm)
    {
        f3v(std::move(mm));
    }
    
    void f1v(ReallyLarge mm)
    {
        f2v(std::move(mm));
    }
    
    void f0v(ReallyLarge mm)
    {
        f1v(std::move(mm));
    }
    
    
    void f3r(ReallyLarge&& mm)
    {
        auto x = std::move(mm);
    }
    
    void f2r(ReallyLarge&& mm)
    {
        f3v(std::move(mm));
    }
    
    void f1r(ReallyLarge&& mm)
    {
        f2v(std::move(mm));
    }
    
    void f0r(ReallyLarge&& mm)
    {
        f1v(std::move(mm));
    }
    
    
    
    static void ByMoveVal(benchmark::State& state) {
      for (auto _ : state) {
        f0v(ReallyLarge{});
      }
    }
    BENCHMARK(ByMoveVal);
    
    static void ByMoveRef(benchmark::State& state) {
      for (auto _ : state) {
        f0r(ReallyLarge{});
      }
    }
    BENCHMARK(ByMoveRef);
    
    

    https://quick-bench.com/q/Mje6qd2wWwW0jFtjfxOd7ajKvO4 enter image description here

    If performance is not critical, the main significant difference between the two is exception safety. For references, it's possible to catch the exception inside the caller while the original passed parameter remains intact. If moving into a value and the called function throws, the original object is lost on caller's end.

    EDITED: Follow-up after re-running the original benchmark many times: it's highly inconsistent, also on my local machine. I have no clue at this stage what it's related to, I'd probably guess memory allocation, but that's just guess.

    However, I have also performed one more test: with the string to copy from moved out from the measuring loop. I thought this would be more fair, as it eliminates one extra string construction from the equation, and the aim of the measurement is to learn the speed of move/copy construction. On my Apple M2 I've gotten this for both C++17 and C++20 with O3:

    ----------------------------------------------------------------------------
    Benchmark                                  Time             CPU   Iterations
    ----------------------------------------------------------------------------
    BM_PassRvalueByReferenceAndCopy         49.4 ns         49.4 ns    850220329
    BM_PassRvalueByValueAndMove             21.8 ns         21.8 ns   1000000000
    BM_PassLvalueByReferenceAndCopy         28.5 ns         28.5 ns   1000000000
    BM_PassLvalueByValueAndMove             28.1 ns         28.1 ns   1000000000
    BM_PassRvalueToStructOfferingBoth       21.8 ns         21.8 ns   1000000000
    BM_PassLValueToStructOfferingBoth       28.4 ns         28.4 ns   1000000000
    

    which is somewhat similar to what QuickBench gives me pretty consistently (Clang and libc++ to reflect setup I have at home): https://quick-bench.com/q/wgOUAodUb-iuS6_CnYgXt_rHbBI enter image description here

    
    #include <benchmark/benchmark.h>
    
    #include <string>
    #include <utility>
    
    struct PassByValueAndMove
    {
        explicit PassByValueAndMove(std::string string)
            : m_string(std::move(string))
        {
        }
    
        std::string m_string;
    };
    
    struct PassByReferenceAndCopy
    {
        explicit PassByReferenceAndCopy(const std::string& string)
            : m_string(string)
        {
        }
    
        std::string m_string;
    };
    
    struct PassBoth
    {
        explicit PassBoth(std::string&& string)
            : m_string(std::move(string))
        {
        }
    
        explicit PassBoth(const std::string& string)
            : m_string(string)
        {
        }
    
        std::string m_string;
    };
    
    static void BM_PassRvalueByReferenceAndCopy(benchmark::State& state)
    {
        for (auto _ : state)
        {
            static_cast<void>(_);
    
            {
                PassByReferenceAndCopy instance("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz");
                benchmark::DoNotOptimize(instance);
            }
        }
    }
    
    BENCHMARK(BM_PassRvalueByReferenceAndCopy);
    
    static void BM_PassRvalueByValueAndMove(benchmark::State& state)
    {
        for (auto _ : state)
        {
            static_cast<void>(_);
    
            {
                PassByValueAndMove instance("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz");
                benchmark::DoNotOptimize(instance);
            }
        }
    }
    
    BENCHMARK(BM_PassRvalueByValueAndMove);
    
    static void BM_PassLvalueByReferenceAndCopy(benchmark::State& state)
    {
        const std::string      dummy_string = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    
        for (auto _ : state)
        {
            static_cast<void>(_);
    
            {
                PassByReferenceAndCopy instance(dummy_string);
                benchmark::DoNotOptimize(instance);
            }
        }
    }
    
    BENCHMARK(BM_PassLvalueByReferenceAndCopy);
    
    static void BM_PassLvalueByValueAndMove(benchmark::State& state)
    {
        const std::string  dummy_string = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    
        for (auto _ : state)
        {
            static_cast<void>(_);
    
            {
                PassByValueAndMove instance(dummy_string);
                benchmark::DoNotOptimize(instance);
            }
        }
    }
    
    BENCHMARK(BM_PassLvalueByValueAndMove);
    
    static void BM_PassRvalueToStructOfferingBoth(benchmark::State& state)
    {
        for (auto _ : state)
        {
            static_cast<void>(_);
    
            {
                PassBoth instance("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz");
                benchmark::DoNotOptimize(instance);
            }
        }
    }
    
    BENCHMARK(BM_PassRvalueToStructOfferingBoth);
    
    static void BM_PassLValueToStructOfferingBoth(benchmark::State& state)
    {
        const std::string dummy_string = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    
        for (auto _ : state)
        {
            static_cast<void>(_);
    
            {
                PassBoth          instance(dummy_string);
                benchmark::DoNotOptimize(instance);
            }
        }
    }
    
    BENCHMARK(BM_PassLValueToStructOfferingBoth);