c++ c++11 iterator c++98 const-iterator

Iterator invalidation by `std::string::begin()`/`std::string::end()`?


#include <string>
#include <iostream>

int main() {
    std::string s = "abcdef";

    std::string s2 = s;

    auto begin = const_cast<std::string const &>(s2).begin();
    auto end = s2.end();

    std::cout << end - begin << '\n';
}

This code mixes the result of begin() const with the result of the non-const end(). Neither of these functions is permitted to invalidate any iterators. However, I'm curious whether the requirement that end() not invalidate the iterator stored in the variable begin actually means that begin can be used together with end.

Consider a C++98, copy-on-write implementation of std::string: the non-const begin() and end() functions cause the internal buffer to be copied, because the result of these functions can be used to modify the string. So begin above starts out valid for both s and s2, but the call to the non-const end() member causes it to no longer be valid for s2, the container that produced it.
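
The mechanism described above can be sketched with a toy class. This is only an illustration under stated assumptions: CowString, its unshare() helper, the Outcome struct and the function name are all hypothetical, and the shared_ptr-based layout is not how libstdc++ actually stores its strings (it also needs C++11 to compile, unlike the implementations under discussion).

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Toy copy-on-write string (hypothetical; not libstdc++'s real layout).
class CowString {
    std::shared_ptr<std::string> buf_;  // buffer shared between copies

    // Give this object a private buffer if it currently shares one.
    void unshare() {
        if (buf_.use_count() > 1)
            buf_ = std::make_shared<std::string>(*buf_);
    }

public:
    explicit CowString(const char* s)
        : buf_(std::make_shared<std::string>(s)) {}

    // Const access cannot modify the text, so the buffer stays shared.
    const char* begin() const { return buf_->data(); }
    const char* end() const { return buf_->data() + buf_->size(); }

    // Non-const access might be used to write, so unshare first.
    char* begin() { unshare(); return &(*buf_)[0]; }
    char* end() { unshare(); return &(*buf_)[0] + buf_->size(); }

    std::size_t size() const { return buf_->size(); }
};

// Records which object the saved const begin() still points into.
struct Outcome {
    bool valid_for_s2;
    bool valid_for_s;
};

Outcome mix_const_begin_with_nonconst_end() {
    CowString s("abcdef");
    CowString s2 = s;  // s2 shares s's buffer
    const CowString& cs = s;
    const CowString& cs2 = s2;

    const char* b = cs2.begin();  // const call: no copy is made
    s2.end();                     // non-const call: s2 gets its own buffer

    Outcome o;
    o.valid_for_s2 = (b == cs2.begin());  // does b still point into s2?
    o.valid_for_s = (b == cs.begin());    // does b still point into s?
    return o;
}
```

In this toy model, mix_const_begin_with_nonconst_end() reports valid_for_s2 as false and valid_for_s as true: the saved pointer still refers to live storage, but that storage now belongs only to s.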

The above code produces 'unexpected' results with a copy-on-write implementation, such as libstdc++. Instead of end - begin being equal to s2.size(), libstdc++ produces some other number.

  • Does causing begin to no longer be a valid iterator into s2, the container it was retrieved from, constitute 'invalidating' the iterator? If you look at the requirements on iterators, they all appear to hold for this iterator after .end() is called, so perhaps begin still qualifies as a valid iterator, and thus has not been invalidated?

  • Is the above code well defined in C++98? In C++11, which prohibits copy-on-write implementations?

From my own brief reading of the specs, it appears under-specified, so that there may not be any guarantee that the results of begin() and end() can ever be used together, even without mixing const and non-const versions.


Solution

  • As you say, C++11 differs from earlier versions in this regard. There's no problem in C++11, because everything that allowed copy-on-write was removed. Before C++11, your code results in undefined behavior: the call s2.end() is allowed to invalidate existing iterators (and did, and maybe still does, in g++).

    Note that even if s2 were not a copy, the standard would allow the call to invalidate iterators. In fact, the committee draft (CD) for C++98 even made things like f( s.begin(), s.end() ) or s[i] == s[j] undefined behavior. This was only realized at the last minute, and corrected so that only the first call to the non-const begin(), end() or [] could invalidate the iterators.
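
    One way to sidestep the issue, sketched here as an assumption rather than as part of the answer above, is to take both iterators through the same const reference. The const overloads of begin() and end() are not among the calls that may invalidate iterators, so a copy-on-write implementation has no reason to unshare between the two calls. The function name length_via_const_iterators is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Both iterators come from the const overloads, so neither call
// needs to unshare the buffer of a copy-on-write string.
std::ptrdiff_t length_via_const_iterators(const std::string& str) {
    std::string::const_iterator first = str.begin();
    std::string::const_iterator last = str.end();
    return last - first;
}
```

    Passing s2 from the question to this function binds it to a const reference, so only the const overloads are selected. In C++11 and later, s2.cbegin() and s2.cend() express the same intent directly.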