Tags: c++, operators, stdstring, double-quotes, null-character

C++: About null characters


There are two string variables, m and n:

#include <iostream>
#include <string>
using namespace std;

int main() {
    string m = "0100700\0";
    cout << m.size() << endl; // it prints: 7

    string n;
    n += "0100700";
    n += '\0';
    cout << n.size() << endl; // it prints: 8
}

I expected both strings to have 8 characters, but m has only 7 while n has 8. Why is this the case?


Solution

  • The first thing to note is that std::string does not have a constructor that can infer the length of a string literal from the underlying array. What it has is a constructor that accepts a const char* and treats it as a null-terminated string. In doing so, it copies characters until it finds the first \0.

    This is the constructor used in string m = "0100700\0";, which is why in the first case your string is of length 7. Note that there is no other way to get the length of a char array from a pointer to its first element.
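
To see where the eighth character goes, it can help to measure the same literal in three ways. The following is only an illustrative sketch; the function names (`literal_array_size`, `length_after_decay`, `length_as_string`) are made up for this example and are not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Size of the literal as an array: 7 digits, the explicit '\0',
// and the terminator the compiler appends.
std::size_t literal_array_size() {
    return sizeof("0100700\0");
}

// After decay to const char*, the array size is no longer available;
// strlen stops at the first '\0'.
std::size_t length_after_decay() {
    const char* p = "0100700\0";
    return std::strlen(p);
}

// std::string's const char* constructor measures its input the same way.
std::size_t length_as_string() {
    std::string m = "0100700\0";
    // No embedded '\0' was copied, so both views of the length agree.
    assert(m.size() == std::strlen(m.c_str()));
    return m.size();
}
```

Called from `main`, these return 9, 7, and 7 respectively: the compiler knows the full array size, but that information is lost once only a pointer is passed along.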

    In the second example, you add a character to a pre-existing std::string object of length 7. This increases the length to 8. If you were to iterate over the elements of the string, you would be able to see that this 8th element is '\0'.

    for (auto c : n)
        if (c == '\0') std::cout << "null character" << std::endl;
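
A related point is that the appended '\0' is an ordinary element as far as std::string is concerned, while C-style functions reading through c_str() still stop at it. The sketch below illustrates the difference; the helper names (`make_n`, `stored_length`, `c_style_length`, `last_char_is_null`) are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Builds the string n from the question: seven digits plus an appended '\0'.
std::string make_n() {
    std::string n;
    n += "0100700";
    n += '\0';
    return n;
}

// Length as std::string tracks it: the embedded '\0' counts.
std::size_t stored_length(const std::string& s) {
    return s.size();
}

// Length as a C API sees it through c_str(): stops at the first '\0'.
std::size_t c_style_length(const std::string& s) {
    return std::strlen(s.c_str());
}

// The appended '\0' is a regular element of the string; the terminator
// that c_str() guarantees sits one position past it.
bool last_char_is_null(const std::string& s) {
    assert(!s.empty());
    return s.back() == '\0';
}
```

For the string returned by `make_n()`, `stored_length` gives 8 and `c_style_length` gives 7, so passing such a string to an API that expects a null-terminated `const char*` silently drops everything from the first '\0' onward.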
    

    To initialize a string that contains '\0' characters, you have several options:

    Use an initializer list:

    std::string s{'a', 'b', '\0', 'd', 'e', '\0', 'g'};
    

    Construct from a different container or array using std::string's iterator constructor:

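    #include <iterator>
    #include <string>
    #include <vector>

    std::vector<char> v{'a', 'b', '\0', 'd', 'e', '\0', 'g'};
    char c[] = {'a', 'b', '\0', 'd', 'e', '\0', 'g'};
    const char* ps = "ab\0de\0g";

    // All three strings hold the same 7 characters.
    std::string s0(std::begin(v), std::end(v));
    std::string s1(std::begin(c), std::end(c));
    std::string s2(ps, ps + 7); // ps + 8 would also copy the literal's terminating '\0'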
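
Two further options may be worth knowing: std::string's (pointer, count) constructor, and the `s` literal suffix from `std::string_literals` (C++14 and later). The sketch below shows both, plus a small helper that deduces the length from the array type. Note that `with_count`, `with_suffix`, and `from_literal` are hypothetical names for this example, not standard functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Option: pass the length explicitly with the (pointer, count) constructor.
std::string with_count() {
    return std::string("ab\0de\0g", 7);
}

// Option (C++14): the s suffix builds a std::string from the whole literal,
// embedded '\0' characters included.
std::string with_suffix() {
    using namespace std::string_literals;
    return "ab\0de\0g"s;
}

// Hypothetical helper: deduce the array size N from the literal's type and
// drop the implicit terminator.
template <std::size_t N>
std::string from_literal(const char (&lit)[N]) {
    // Guards against char arrays that are not null-terminated.
    assert(lit[N - 1] == '\0');
    return std::string(lit, N - 1);
}
```

All three produce a 7-character string for "ab\0de\0g", and `from_literal("0100700\0")` yields the 8-character string the question expected. The helper works because it takes the literal by reference to array, so the size is still part of the type when the constructor is called.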