Tags: c++, string, foreach, c++11, string-literals

Inconsistency between std::string and string literals


I have discovered a disturbing inconsistency between std::string and string literals in C++0x:

#include <iostream>
#include <string>

int main()
{
    int i = 0;
    for (auto e : "hello")
        ++i;
    std::cout << "Number of elements: " << i << '\n';

    i = 0;
    for (auto e : std::string("hello"))
        ++i;
    std::cout << "Number of elements: " << i << '\n';

    return 0;
}

The output is:

Number of elements: 6
Number of elements: 5

I understand the mechanics of why this is happening: the string literal is really an array of characters that includes the null character, and the range-based for loop iterates up to one past the end of that array (the same position std::end() returns for an array); since the null character is part of the array, that position lies past the null character.
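To make those mechanics concrete, here is a minimal sketch that measures the range delimited by std::begin() and std::end() for both kinds of string. The helper name range_length is made up for illustration and is not part of the standard library.

```cpp
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical helper: length of the range [std::begin(r), std::end(r)).
template <typename Range>
std::ptrdiff_t range_length(const Range& r)
{
    return std::distance(std::begin(r), std::end(r));
}

// The literal "hello" has type const char[6]: five letters plus '\0'.
static_assert(sizeof("hello") == 6, "the array bound includes the terminator");
```

Under these assumptions, range_length("hello") yields 6 while range_length(std::string("hello")) yields 5, matching the output above.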

However, I think this is very undesirable: surely std::string and string literals should behave the same when it comes to properties as basic as their length?

Is there a way to resolve this inconsistency? For example, can std::begin() and std::end() be overloaded for character arrays so that the range they delimit does not include the terminating null character? If so, why was this not done?

EDIT: To justify my indignation a bit more to those who have said that I'm just suffering the consequences of using C-style strings which are a "legacy feature", consider code like the following:

template <typename Range>
void f(Range&& r)
{
    for (auto e : r)
    {
        ...
    }
}

Would you expect f("hello") and f(std::string("hello")) to do something different?
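One possible workaround that leaves std::begin() and std::end() alone is to wrap the literal explicitly before passing it to generic code. The sketch below is only an illustration: literal_view, as_text and f_count are hypothetical names, f_count stands in for f with a loop body that merely counts, and as_text assumes the array really is a null-terminated string literal.

```cpp
#include <cstddef>
#include <string>

// Hypothetical minimal view over a character array, excluding its last element.
struct literal_view
{
    const char* first;
    const char* last;
    const char* begin() const { return first; }
    const char* end() const { return last; }
};

// Assumption: the array is a string literal, so its last element is '\0'.
template <std::size_t N>
literal_view as_text(const char (&a)[N])
{
    return literal_view{a, a + N - 1};
}

// Stand-in for the question's f: counts the elements the loop visits.
template <typename Range>
std::size_t f_count(Range&& r)
{
    std::size_t n = 0;
    for (auto e : r)
    {
        (void)e; // the element itself is not needed for counting
        ++n;
    }
    return n;
}
```

With this in place, f_count("hello") still visits six elements, whereas f_count(as_text("hello")) and f_count(std::string("hello")) both visit five. The cost is that every call site has to remember to wrap the literal.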


Solution

  • The inconsistency can be resolved using another tool in C++0x's toolbox: user-defined literals. Given an appropriate literal operator (the suffix starts with an underscore because suffixes without one are reserved for the standard library):

    std::string operator""_s(const char* p, std::size_t n)
    {
        // n is the literal's length, not counting the terminating null
        return std::string(p, n);
    }

    we can write:

    int i = 0;
    for (auto e : "hello"_s)
        ++i;
    std::cout << "Number of elements: " << i << '\n';

    which now outputs the expected number:

    Number of elements: 5
    

    With these new std::string literals, there is arguably no more reason to use C-style string literals, ever.
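Since C++14 the standard library ships such a literal itself, in the namespace std::literals::string_literals, so the operator no longer has to be written by hand. A minimal sketch, assuming a C++14 (or later) compiler; count_hello_literal is a hypothetical name used only for this example:

```cpp
#include <cstddef>
#include <string>

// Counts the elements visited when iterating over "hello"s, a std::string
// created by the standard library's operator""s (C++14 and later).
inline std::size_t count_hello_literal()
{
    using namespace std::string_literals; // brings operator""s into scope
    std::size_t n = 0;
    for (auto e : "hello"s)
    {
        (void)e; // only the number of iterations matters here
        ++n;
    }
    return n;
}
```

Because the literal operator receives the length without the terminating null, the loop runs five times.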