c++performance compiler-optimization stdvector loop-invariant

Is calling std::vector::size() as fast as reading a variable?

I have do an extensive calculation on a big vector of integers. The vector size is not changed during the calculation. The size of the vector is frequently accessed by the code. What is faster in general:

using the vector::size() function or
using a helper constant vectorSize for storing the size of the vector?

I know that compilers usually inline the size() function when setting the proper compiler flags, however, this is not guaranteed.

Solution

Interesting question.

So, what's going to happened ? Well if you debug with gdb you'll see something like 3 member variables (names are not accurate):

_M_begin: pointer to the first element of the dynamic array
_M_end: pointer one past the last element of the dynamic array
_M_capacity: pointer one past the last element that could be stored in the dynamic array

The implementation of vector<T,Alloc>::size() is thus usually reduced to:

return _M_end - _M_begin;  // Note: _Mylast - _Myfirst in VC 2008

Now, there are 2 things to consider when regarding the actual optimizations possible:

will this function be inlined ? Probably: I am no compiler writer, but it's a good bet since the overhead of a function call would dwarf the actual time here and since it's templated we have all the code available in the translation unit
will the result be cached (ie sort of having an unnamed local variable): it could well be, but you won't know unless you disassemble the generated code

In other words:

If you store the size yourself, there is a good chance it will be as fast as the compiler could get it.
If you do not, it will depend on whether the compiler can establish that nothing else is modifying the vector; if not, it cannot cache the variable, and will need to perform memory reads (L1) every time.

It's a micro-optimization. In general, it will be unnoticeable, either because the performance does not matter or because the compiler will perform it regardless. In a critical loop where the compiler does not apply the optimization, it can be a significant improvement.