Performance of repetitive indirection

I find myself debating whether to write code in the style of Code 1 or Code 2. In my opinion, Code 1 looks cleaner, but in theory, can I expect a performance penalty from its extra indirections compared to Code 2? Are there any relevant compiler optimizations here? Does anything change if bar() returns a Bar*?

Code 1:

foo.bar().method1();
foo.bar().method2();
foo.bar().method3();
foo.bar().method4();

Code 2:

Bar& bar = foo.bar(); //Java programmers: ignore ampersand
bar.method1();
bar.method2();
bar.method3();
bar.method4();

EDIT: I think there are too many variables to ask such a general question (e.g. const vs non-const methods, whether the compiler inlines the methods, how the compiler treats the references, etc.). Inspecting the assembly generated for my specific code is perhaps the way to go.

Solution

Reference tests

I ran a simple test. When compiled with no optimizations, on my machine Test_1 took 1272 ms and Test_2 took 1108 ms (I ran each test several times, and the results stayed within a couple of ms). With O2/O3 optimizations, both tests took the same amount of time: 946 ms.

    #include <iostream>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <chrono>

    using namespace std;

    class Foo
    {
    public:
      Foo() : x_(0) {}
      void add(unsigned amt)
      {
        x_ += amt;
      }
      unsigned x_;
    };

    class Bar
    {
    public:
      Foo& get()
      {
        return foo_;
      }
    private:
      Foo foo_;
    };

    int main()
    {
      srand(time(NULL));
      Bar bar;
      constexpr int N = 100000000;
      //Foo& foo = bar.get(); //TEST_2
      auto start_time = chrono::high_resolution_clock::now();
      for (int i = 0; i < N; ++i)
      {
        bar.get().add(rand()); //TEST_1
        //foo.add(rand()); //TEST_2
      }
      auto end_time = chrono::high_resolution_clock::now();

      cout << bar.get().x_ << endl;
      cout << "Time: ";
      cout << chrono::duration_cast<chrono::milliseconds>(end_time - start_time).count() << endl;
    }
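
A minimal sketch of how both variants could be timed in one program instead of commenting lines in and out. The names time_ms, run_repeated, run_cached and compare_styles are hypothetical and not part of the test above. The loop index replaces rand() so that both variants see identical input and their results can be compared; with optimizations enabled, a compiler may then collapse the loops entirely, so the printed timings are illustrative only.

```cpp
#include <cassert>
#include <chrono>
#include <iostream>

// Same shape as the classes in the test above.
class Foo {
public:
  void add(unsigned amt) { x_ += amt; }
  unsigned x_ = 0;
};

class Bar {
public:
  Foo& get() { return foo_; }
private:
  Foo foo_;
};

// Hypothetical helper: runs `body` once and returns the elapsed milliseconds.
template <typename F>
long long time_ms(F&& body) {
  auto start = std::chrono::steady_clock::now();
  body();
  auto end = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}

// TEST_1 style: go through the accessor on every iteration.
unsigned run_repeated(Bar& bar, int n) {
  for (int i = 0; i < n; ++i) bar.get().add(static_cast<unsigned>(i));
  return bar.get().x_;
}

// TEST_2 style: fetch the reference once, then reuse it.
unsigned run_cached(Bar& bar, int n) {
  Foo& foo = bar.get();
  for (int i = 0; i < n; ++i) foo.add(static_cast<unsigned>(i));
  return foo.x_;
}

// Times both styles on separate objects and checks that they agree.
bool compare_styles(int n) {
  Bar a, b;
  unsigned r1 = 0, r2 = 0;
  long long t1 = time_ms([&] { r1 = run_repeated(a, n); });
  long long t2 = time_ms([&] { r2 = run_cached(b, n); });
  std::cout << "repeated: " << t1 << " ms, cached: " << t2 << " ms\n";
  assert(r1 == r2);  // sanity check: both styles must compute the same sum
  return r1 == r2;
}
```

Calling compare_styles(100000000) from main would mirror the iteration count used above.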

Pointer tests

I reran the tests, but this time with the class member being a pointer. When compiled with no optimizations, on my machine Test_3 took 1285-1340 ms and Test_4 1110 ms. With O2/O3 optimizations, both tests appeared to take an equal amount of time: 915 ms (surprisingly, less time than the reference tests above).

    #include <iostream>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <chrono>

    using namespace std;

    class Foo
    {
    public:
      Foo() : x_(0) {}
      void add(unsigned amt)
      {
        x_ += amt;
      }
      unsigned x_;
    };

    class Bar
    {
    public:
      ~Bar()
      {
        delete foo_;
      }
      Foo* get()
      {
        return foo_;
      }
    private:
      Foo* foo_ = new Foo;
    };

    int main()
    {
      srand(time(NULL));
      Bar bar;
      constexpr int N = 100000000;
      //Foo* foo = bar.get(); //TEST_4
      auto start_time = chrono::high_resolution_clock::now();
      for (int i = 0; i < N; ++i)
      {
        bar.get()->add(rand()); //TEST_3
        //foo->add(rand()); //TEST_4
      }
      auto end_time = chrono::high_resolution_clock::now();

      cout << bar.get()->x_ << endl;
      cout << "C++ Time: ";
      cout << chrono::duration_cast<chrono::milliseconds>(end_time - start_time).count() << endl;
    }

Conclusion

According to these simple tests on my machine, the Code 2 style is around 15% faster when optimizations are not enabled, but with optimizations enabled there is no measurable difference in performance.
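
One caveat: the two styles are only interchangeable when bar() is a plain accessor that returns the same object every time. The sketch below, which assumes a hypothetical call counter added to the accessor, makes the difference visible: as written, Code 1 invokes the accessor four times and Code 2 invokes it once. When the compiler can see a trivial accessor body it can typically inline it and drop the repeated work, which would be consistent with the equal timings under O2/O3; any observable side effect, like the counter here, has to be preserved.

```cpp
#include <cassert>

struct Bar {
  int calls = 0;  // number of methods invoked on this Bar
  void method1() { ++calls; }
  void method2() { ++calls; }
  void method3() { ++calls; }
  void method4() { ++calls; }
};

class Foo {
public:
  // Accessor with a counter (an assumption for illustration only).
  Bar& bar() {
    ++accessor_calls_;
    return bar_;
  }
  int accessor_calls() const { return accessor_calls_; }
private:
  Bar bar_;
  int accessor_calls_ = 0;
};

// Code 1 style: call the accessor before every method.
void code1(Foo& foo) {
  foo.bar().method1();
  foo.bar().method2();
  foo.bar().method3();
  foo.bar().method4();
}

// Code 2 style: call the accessor once and reuse the reference.
void code2(Foo& foo) {
  Bar& bar = foo.bar();
  bar.method1();
  bar.method2();
  bar.method3();
  bar.method4();
}
```

Both functions end up invoking the same four methods on the same Bar; only the number of accessor calls differs.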