c++ c++11 exception compiler-specific

Using size_type as iterator offset - difference between GCC/Clang and MSVC


I'm trying to use a size_type value as an offset to a string iterator in the std::copy() algorithm. When the value is std::string::npos, GCC and Clang don't throw any exception, but MSVC throws "cannot seek string iterator before begin".

Why does MSVC throw this exception? Which compiler is correct in this case?

I've tested the program with the following compilers on Compiler Explorer:

GCC - x86-64 gcc 14.2
Clang - x86-64 clang 19.1.0
MSVC - x86 msvc v19.latest

Here is a minimal reproducible program:


#include <iostream>
#include <algorithm>
#include <iterator>
#include <cstdint>
#include <string>

// timestamp in hhmmss.zzz format; can be in hhmmss format also
void test(const std::string& timestamp)
{
    std::string timeWithoutMs{};
    std::int32_t milliseconds{};

    auto itr = timestamp.find("."); 
    if( itr != std::string::npos)
    {
        milliseconds = std::stoi( timestamp.substr(itr + 1, timestamp.length() - itr));
    }

    std::copy( timestamp.begin(), timestamp.begin() + itr, std::back_inserter(timeWithoutMs));
    std::cout << timeWithoutMs << " - " << milliseconds << std::endl;
}

int main()
{
    test("135523.495");  
    test("135523");      // MSVC throws exception here
}

Solution

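  • std::string::npos is defined as size_type(-1), which is the largest value size_type can hold. When it is added to an iterator, it is converted to the iterator's signed difference type, which on common implementations yields -1. As such, when itr is std::string::npos, that std::copy call effectively looks like: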

    std::copy( timestamp.begin(), timestamp.begin() + -1, std::back_inserter(timeWithoutMs));
    

    Note that the second argument attempts to move the begin iterator of the string backwards, and forming an iterator before begin() is undefined behaviour. MSVC's standard library chooses to diagnose this, as by default it has "checked iterators", which verify that iterators stay within bounds. GCC and Clang aren't required to diagnose this, so their behaviour is still correct, but your program is invalid, as you call std::copy with an invalid iterator range.

    You can instead use std::string::substr, which is well-defined when the count argument is std::string::npos, and in this case runs up to the end of the string:

    timeWithoutMs = timestamp.substr(0, itr);
    

    I would also rename the itr variable to idx, since std::string::find returns an index rather than an iterator, and I think this adds to your confusion.
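
Putting those two suggestions together, the function could be sketched roughly as follows. This is only an illustration, not the asker's final code: the name splitTimestamp and the choice to return a std::pair are assumptions made here so the result is easy to inspect, and the sketch stays within C++11 to match the question's tags.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical helper (name and return type are not from the original post).
// Splits a timestamp in "hhmmss.zzz" or "hhmmss" format into the part before
// the '.' and the milliseconds (0 when there is no fractional part).
std::pair<std::string, std::int32_t> splitTimestamp(const std::string& timestamp)
{
    std::int32_t milliseconds{};

    // find() returns an index (size_type), not an iterator.
    const std::string::size_type idx = timestamp.find('.');
    if (idx != std::string::npos)
    {
        // substr(pos) with no count runs to the end of the string.
        // Note: std::stoi throws std::invalid_argument if nothing follows the '.'.
        milliseconds = std::stoi(timestamp.substr(idx + 1));
    }

    // substr(0, npos) is well-defined and returns the whole string,
    // so no iterator arithmetic with npos is needed.
    return std::make_pair(timestamp.substr(0, idx), milliseconds);
}
```

With this version, splitTimestamp("135523.495") gives "135523" and 495, and splitTimestamp("135523") gives "135523" and 0 without ever forming an iterator outside the string.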