Tags: c++, stdstring

Which is faster for getting part of a string: std::string::erase or std::string::substr?


I need to extract and store part of a string, and I can do it with either std::string::erase or std::string::substr.

I would like to know which of the following approaches is faster (less time to complete) and more efficient (fewer memory allocations/reallocations). Any information about how erase and substr allocate or reallocate memory would also be very helpful. Thanks!

std::string nodeName("ABCD#XYZ#NodeName");
const std::string levelSeparator("#");

Option 1: Using std::string::substr

std::string::size_type nodeNameStartPosition = nodeName.rfind(levelSeparator);
if (nodeNameStartPosition != std::string::npos)
{
    nodeNameStartPosition += levelSeparator.length();
    nodeName = nodeName.substr(nodeNameStartPosition);
}

Option 2: Using std::string::erase

std::string::size_type nodeNameStartPosition = nodeName.rfind(levelSeparator);
if (nodeNameStartPosition != std::string::npos)
{
    nodeNameStartPosition += levelSeparator.length();
    nodeName = nodeName.erase(0, nodeNameStartPosition);
}

Solution

  • If you really care, always benchmark.

    You don't need the self-assignment in nodeName = nodeName.erase(0, nodeNameStartPosition); - just use:

    nodeName.erase(0, nodeNameStartPosition);
    

    This works because erase already modifies the string nodeName in place.
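
To make the point concrete: erase returns a reference to the string it was called on, so assigning the result back is an assignment of the string to itself. A minimal sketch follows; the helper name eraseReturnsSelf is made up for illustration and is not part of the question's code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: reports whether erase() hands back the very
// object it was called on.
bool eraseReturnsSelf(std::string& s, std::size_t count)
{
    std::string& result = s.erase(0, count); // modifies s in place
    return &result == &s;                    // same object, so "s = s.erase(...)" assigns s to itself
}
```

After calling it on a string holding "ABCD#XYZ#NodeName" with a count of 9, the string itself holds "NodeName"; no assignment is needed.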

    Any speed difference is overwhelmingly likely to be in erase's favour: erase does not need to allocate memory, it only moves the remaining characters down within the existing buffer. substr(), by contrast, has to construct a new string, as you can tell from the by-value return type in the std::string::substr function prototype:

    string substr (size_t pos = 0, size_t len = npos) const;
    

    This by-value return may require heap allocation unless short-string optimisation kicks in. I'm sceptical whether optimisers can remove those overheads.
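
One way to observe this (rather than prove it) is to compare buffer addresses before and after each call. The sketch below assumes a mainstream standard library; the standard does not strictly promise that erase keeps the same buffer, and both helper names are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if erase() left the string's buffer where it was.
// Works on a copy so the caller's string is untouched.
bool eraseKeepsBuffer(std::string s, std::size_t count)
{
    const char* before = s.data();
    s.erase(0, count);          // shifts the tail down inside the existing buffer
    return s.data() == before;
}

// Hypothetical helper: true if substr() produced a string with its own storage.
// pos must not exceed s.size(), otherwise substr() throws std::out_of_range.
bool substrUsesSeparateStorage(const std::string& s, std::size_t pos)
{
    const std::string part = s.substr(pos); // a new std::string object
    return part.data() != s.data() + pos;   // its characters live elsewhere
}
```

Whether the separate storage is on the heap or inside the string object itself depends on the length of the result and on the implementation's short-string optimisation threshold.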

    Separately, nodeNameStartPosition is a misnomer at the point where it is assigned, because rfind gives you the position of the level separator, not of the node name. It all boils down to:

    std::string::size_type levelSeparatorPos = nodeName.rfind(levelSeparator);
    if (levelSeparatorPos != std::string::npos)
        nodeName.erase(0, levelSeparatorPos + levelSeparator.length());
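
Since the first piece of advice is to benchmark, here is a rough sketch of how the two options could be timed with std::chrono. The function names (lastSegmentSubstr, lastSegmentErase, timeNs) and the sink parameter are illustrative assumptions, and a loop like this is only a crude measurement, not a rigorous benchmark.

```cpp
#include <chrono>
#include <cstddef>
#include <string>

// Option 1 from the question: copy the tail out with substr and assign it back.
std::string lastSegmentSubstr(std::string name, const std::string& sep)
{
    const std::string::size_type pos = name.rfind(sep);
    if (pos != std::string::npos)
        name = name.substr(pos + sep.length());
    return name;
}

// Option 2: erase the prefix in place.
std::string lastSegmentErase(std::string name, const std::string& sep)
{
    const std::string::size_type pos = name.rfind(sep);
    if (pos != std::string::npos)
        name.erase(0, pos + sep.length());
    return name;
}

// Calls fn(input, sep) `iterations` times and returns the elapsed time in
// nanoseconds. Result lengths are added to `sink` so the calls are less
// likely to be optimised away.
template <typename Fn>
long long timeNs(Fn fn, const std::string& input, const std::string& sep,
                 int iterations, std::size_t& sink)
{
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i)
        sink += fn(input, sep).size();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Call timeNs once with each function on the same input, in an optimised build, with a large iteration count, and try inputs both shorter and longer than your implementation's short-string threshold, since that is where the allocation cost of substr is most likely to show up. Note that each call also copies the input string, which adds the same overhead to both options.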