Tags: java, c++, performance, repeat, indirection

Performance of repetitive indirection


I find myself debating whether to write code in the style of Code 1 or Code 2. In my opinion, Code 1 looks cleaner, but can I expect, at least in theory, a performance penalty from its extra indirections compared to Code 2? Are there any relevant compiler optimizations here? Does anything change if bar() returns a Bar*?

Code 1:

foo.bar().method1();
foo.bar().method2();
foo.bar().method3();
foo.bar().method4();

Code 2:

Bar& bar = foo.bar(); //Java programmers: ignore ampersand
bar.method1();
bar.method2();
bar.method3();
bar.method4();

EDIT: I think there are too many variables to ask such a general question (e.g. const vs non-const methods, whether the compiler inlines the methods, how the compiler treats the references etc). Analyzing my specific code in assembly is perhaps the way to go.


Solution

  • Reference tests

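    I ran a simple test in which the accessor returns a reference. Compiled with no optimizations, on my machine Test_1 (accessor called on every iteration, as in Code 1) took 1272 ms and Test_2 (reference stored once, as in Code 2) took 1108 ms. I ran each test several times, and the results stayed within a couple of ms. With O2/O3 optimizations, both tests appeared to take an equal amount of time: 946 ms.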

        #include <iostream>
        #include <stdio.h>
        #include <stdlib.h>
        #include <time.h>
        #include <chrono>
    
        using namespace std;
    
        class Foo
        {
        public:
          Foo() : x_(0) {}
          void add(unsigned amt)
          {
            x_ += amt;
          }
          unsigned x_;
        };
    
        class Bar
        {
        public:
          Foo& get()
          {
            return foo_;
          }
        private:
          Foo foo_;
        };
    
        int main()
        {
          srand(time(NULL));
          Bar bar;
          constexpr int N = 100000000;
          //Foo& foo = bar.get(); //TEST_2
          auto start_time = chrono::high_resolution_clock::now();
          for (int i = 0; i < N; ++i)
          {
            bar.get().add(rand()); //TEST_1
            //foo.add(rand()); //TEST_2
          }
          auto end_time = chrono::high_resolution_clock::now();
    
          cout << bar.get().x_ << endl;
          cout << "Time: ";
          cout << chrono::duration_cast<chrono::milliseconds>(end_time - start_time).count() << endl;
        }
    

    Pointer tests

    I reran the tests, but this time with the class member held as a pointer and the accessor returning a Foo*. Compiled with no optimizations, on my machine Test_3 (accessor called on every iteration) took 1285-1340 ms and Test_4 (pointer stored once) took 1110 ms. With O2/O3 optimizations, both tests appeared to take an equal amount of time: 915 ms (surprisingly, less time than the reference tests above).

        #include <iostream>
        #include <stdio.h>
        #include <stdlib.h>
        #include <time.h>
        #include <chrono>

        using namespace std;

        class Foo
        {
        public:
          Foo() : x_(0) {}
          void add(unsigned amt)
          {
            x_ += amt;
          }
          unsigned x_;
        };

        class Bar
        {
        public:
          ~Bar()
          {
            delete foo_;
          }
          Foo* get()
          {
            return foo_;
          }
        private:
          Foo* foo_ = new Foo;
        };

        int main()
        {
          srand(time(NULL));
          Bar bar;
          constexpr int N = 100000000;
          //Foo* foo = bar.get(); //TEST_4
          auto start_time = chrono::high_resolution_clock::now();
          for (int i = 0; i < N; ++i)
          {
            bar.get()->add(rand()); //TEST_3
            //foo->add(rand()); //TEST_4
          }
          auto end_time = chrono::high_resolution_clock::now();

          cout << bar.get()->x_ << endl;
          cout << "C++ Time: ";
          cout << chrono::duration_cast<chrono::milliseconds>(end_time - start_time).count() << endl;
        }
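
Both programs above switch between variants by commenting lines in and out. As a rough sketch, the two styles can also sit side by side as separate functions in one translation unit, which makes them easier to time in a single run or to compare in the generated assembly. Foo and Bar mirror the reference test; time_ms, repeated_accessor and cached_reference are hypothetical helper names introduced only for this illustration. The loop index stands in for rand() so that the result is deterministic.

```cpp
#include <chrono>

// Same shape as the Foo/Bar classes in the reference test above.
class Foo {
public:
  void add(unsigned amt) { x_ += amt; }
  unsigned x_ = 0;
};

class Bar {
public:
  Foo& get() { return foo_; }
private:
  Foo foo_;
};

// Hypothetical helper: runs any callable and returns elapsed wall time in ms.
template <typename F>
long long time_ms(F&& body) {
  auto start = std::chrono::steady_clock::now();
  body();
  auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}

// Code 1 style: go through the accessor on every iteration.
unsigned repeated_accessor(Bar& bar, int n) {
  for (int i = 0; i < n; ++i) {
    bar.get().add(static_cast<unsigned>(i));
  }
  return bar.get().x_;
}

// Code 2 style: bind the reference once, then reuse it.
unsigned cached_reference(Bar& bar, int n) {
  Foo& foo = bar.get();
  for (int i = 0; i < n; ++i) {
    foo.add(static_cast<unsigned>(i));
  }
  return foo.x_;
}
```

A call such as `time_ms([&] { repeated_accessor(bar, 100000000); })` would time one variant. Because the input is deterministic here, an optimizing compiler may simplify the loop considerably, so timings from this sketch should be read with care and checked against the assembly.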
    

    Conclusion

    According to these simple tests on my machine, the Code 2 style is roughly 15% faster when optimizations are not enabled (about 1110 ms versus 1270-1340 ms). With O2/O3 optimizations enabled, there is no measurable difference in performance.
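
The tie in the optimized builds is plausibly because the accessors in these tests are trivial and defined inside the class, so the compiler can see through them. That is not guaranteed in general: the two styles are only interchangeable when repeating the accessor call has no observable effect. The sketch below uses hypothetical names (Widget, Holder, code1_style, code2_style) and an accessor that counts its own calls, to show a case where the compiler has to keep every call.

```cpp
// Hypothetical types for illustration; none of these names come from the tests above.
struct Widget {
  int value = 0;
  void bump() { ++value; }
};

class Holder {
public:
  // Accessor with an observable side effect: it counts how often it is called.
  Widget& item() {
    ++calls_;
    return item_;
  }
  int calls() const { return calls_; }

private:
  Widget item_;
  int calls_ = 0;
};

// Code 1 style: the accessor runs once per statement.
int code1_style(Holder& h) {
  h.item().bump();
  h.item().bump();
  h.item().bump();
  h.item().bump();
  return h.calls();
}

// Code 2 style: the accessor runs once and the reference is reused.
int code2_style(Holder& h) {
  Widget& w = h.item();
  w.bump();
  w.bump();
  w.bump();
  w.bump();
  return h.calls();
}
```

On a fresh Holder, code1_style reports four accessor calls and code2_style reports one, while the Widget ends up with the same value either way. When the accessor is this cheap the difference hardly matters, but if it were defined in another translation unit or did real work, the Code 2 style would avoid the repeated calls regardless of optimization level.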