c++, performance, compiler-optimization, stdvector, loop-invariant

Is calling std::vector::size() as fast as reading a variable?


I have to do an extensive calculation on a big vector of integers. The vector size does not change during the calculation, and the code accesses the size frequently. Which is faster in general:

  • using the vector::size() function or
  • using a helper constant vectorSize for storing the size of the vector?

I know that compilers usually inline the size() function when the proper compiler flags are set. However, this is not guaranteed.


Solution

  • Interesting question.

    So, what's going to happen? Well, if you debug with gdb you'll see something like 3 member variables (the names are not accurate):

    • _M_begin: pointer to the first element of the dynamic array
    • _M_end: pointer one past the last element of the dynamic array
    • _M_capacity: pointer one past the last element that could be stored in the dynamic array

    The implementation of vector<T,Alloc>::size() is thus usually reduced to:

    return _M_end - _M_begin;  // Note: _Mylast - _Myfirst in VC 2008
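
To make the three-pointer layout concrete, here is a minimal sketch. `TinyIntVec` and its member names are hypothetical and heavily simplified (no allocator, no growth, no copying); it only illustrates why size() boils down to a pointer subtraction and is not how any particular standard library is written.

```cpp
#include <cstddef>

// Hypothetical, simplified stand-in for a vector of int.
// Real implementations add allocators, growth, exception safety, etc.
struct TinyIntVec {
    int* begin_ = nullptr;     // pointer to the first element
    int* end_ = nullptr;       // pointer one past the last element
    int* capacity_ = nullptr;  // pointer one past the end of the allocated storage

    explicit TinyIntVec(std::size_t n)
        : begin_(new int[n]()), end_(begin_ + n), capacity_(begin_ + n) {}
    ~TinyIntVec() { delete[] begin_; }

    // Copying is disabled to keep the sketch short and safe.
    TinyIntVec(const TinyIntVec&) = delete;
    TinyIntVec& operator=(const TinyIntVec&) = delete;

    // size() reads two pointers and subtracts them; nothing is counted.
    std::size_t size() const {
        return static_cast<std::size_t>(end_ - begin_);
    }

    // capacity() works the same way, using the third pointer.
    std::size_t capacity() const {
        return static_cast<std::size_t>(capacity_ - begin_);
    }
};
```

Because the body is this small and visible in the header, it is a natural candidate for inlining when optimizations are enabled.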
    

    Now, there are 2 things to consider regarding the optimizations that are actually possible:

    • Will this function be inlined? Probably. I am no compiler writer, but it's a good bet: the overhead of a function call would dwarf the actual work here, and since it's a template, all the code is available in the translation unit.
    • Will the result be cached (i.e. kept in a sort of unnamed local variable)? It could well be, but you won't know unless you disassemble the generated code.

    In other words:

    • If you store the size yourself, there is a good chance it will be as fast as the compiler could get it.
    • If you do not, it depends on whether the compiler can prove that nothing else modifies the vector. If it cannot, it cannot cache the value and has to re-read the pointers from memory (most likely served from the L1 cache) on every access.
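
The two options can be sketched as follows. The function names are hypothetical, and whether the first and third loops end up compiling to the same code as the second depends on the compiler, the flags, and what it can prove about the loop body, so treat this as an illustration rather than a statement about any specific compiler.

```cpp
#include <cstddef>
#include <vector>

// Option 1: call size() in the loop condition. The optimizer may hoist
// the call out of the loop if it can prove the vector is not modified.
long long sum_calling_size(const std::vector<int>& v) {
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}

// Option 2: store the size in a local constant. The bound is read once,
// regardless of what the optimizer manages to prove.
long long sum_cached_size(const std::vector<int>& v) {
    const std::size_t n = v.size();
    long long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += v[i];
    }
    return total;
}

// A harder case for the optimizer: `transform` is an opaque function
// pointer. Unless the call is inlined, the compiler may have to assume
// the callee could reach the vector some other way and resize it, so
// size() may be re-evaluated on every iteration.
long long sum_with_callback(const std::vector<int>& v, int (*transform)(int)) {
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += transform(v[i]);
    }
    return total;
}
```

Note that the cached version is only equivalent while the vector size really does stay fixed for the whole loop, which is the situation described in the question.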

    It's a micro-optimization. In general, it will be unnoticeable, either because the performance does not matter or because the compiler will perform it regardless. In a critical loop where the compiler does not apply the optimization, it can be a significant improvement.
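
If disassembling is not practical, a rough timing comparison can indicate whether the difference matters for a given loop. The sketch below is an assumption-laden illustration: `average_ns`, `Timings`, and `compare_loops` are hypothetical names, the measurement is coarse, and results vary with compiler, flags, and hardware.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: runs `work` `repeats` times (repeats > 0) and
// returns the average wall-clock time per run in nanoseconds.
template <class Work>
double average_ns(Work work, int repeats) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int r = 0; r < repeats; ++r) {
        work();
    }
    const auto stop = clock::now();
    return std::chrono::duration<double, std::nano>(stop - start).count() / repeats;
}

struct Timings {
    double calling_size_ns;  // loop that calls v.size() in its condition
    double cached_size_ns;   // loop that uses a cached local size
};

Timings compare_loops(const std::vector<int>& v, int repeats) {
    // Writing to a volatile keeps the optimizer from deleting the loops.
    volatile long long sink = 0;
    Timings t{};
    t.calling_size_ns = average_ns([&] {
        long long total = 0;
        for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
        sink = total;
    }, repeats);
    t.cached_size_ns = average_ns([&] {
        const std::size_t n = v.size();
        long long total = 0;
        for (std::size_t i = 0; i < n; ++i) total += v[i];
        sink = total;
    }, repeats);
    return t;
}
```

For a simple loop like this one, an optimized build will often show no meaningful difference between the two timings; the comparison is more informative when run on the actual critical loop.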