c++ stream getline stringstream istringstream

C++ getline()'s undocumented behavior


In C++, when you use getline() with a delimiter on a stringstream, there are a few things that I haven't found documented, but that behave in a handy, non-error way when:

  • the delimiter is not found => the whole string (or the rest of it) is returned
  • there is a delimiter but nothing before it => an empty string is returned
  • you try to read a field past the end of the input => the string still holds the last field that was successfully read

Some test code (simplified):

#include <iostream>
#include <string>
#include <sstream>
using namespace std;

string test(const string &s, char delim, int parseIndex ){
    stringstream ss(s);
    string parsedStr = "";
    
    for( int i = 0; i < (parseIndex+1); i++ ) getline(ss, parsedStr, delim);
    
    return parsedStr;
}

int main() {
    stringstream ss("something without delimiter");
    string s1;
    getline(ss,s1,';');
    cout << "'" << s1  << "'" << endl; //no delim
    cout << endl;
    
    string s2 = "321;;123";
    cout << "'" << test(s2,';',0) << "'" << endl; //classic
    cout << "'" << test(s2,';',1) << "'" << endl; //nothing before
    cout << "'" << test(s2,';',2) << "'" << endl; //no delim at the end
    cout << "'" << test(s2,';',3) << "'" << endl; //this shouldn't be there
    cout << endl;
    
    return 0;
}

Test code output:

'something without delimiter'

'321'
''
'123'
'123'

Test code fiddle: http://ideone.com/ZAuydR

The Question

Can this behaviour be relied on? If so, where is it documented, if anywhere?

Thanks for answers and clarifying :)


Solution

  • The behavior of C++ facilities is specified by the ISO C++ standard, but it's not the most readable resource. In this case, cppreference.com has good coverage.

    Here's what they have to say. The quoted passages are copy-pasted; I've interspersed explanations that address your questions.

    Behaves as UnformattedInputFunction, except that input.gcount() is not affected. After constructing and checking the sentry object, performs the following:

    "Constructing and checking the sentry" means that if an error condition has been detected on the stream, the function will return without doing anything. This is why in #3 you observe the last valid input when "nothing should be there."

    1) Calls str.erase()

    So, if nothing is subsequently found before the delimiter, you'll get an empty string.

    2) Extracts characters from input and appends them to str until one of the following occurs (checked in the order listed)

    a) end-of-file condition on input, in which case, getline sets eofbit.

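    Once eofbit is set, the stream is no longer good(), so the sentry check in every subsequent getline fails (and sets failbit). The function then returns before it even calls str.erase(), which is why the string variable is left unchanged by subsequent getlines.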

    It also allows you to observe the last segment of input before the end, so you may treat the end-of-file as a delimiter if you wish.

    b) the next available input character is delim, as tested by Traits::eq(c, delim), in which case the delimiter character is extracted from input, but is not appended to str.

    c) str.max_size() characters have been stored, in which case getline sets failbit and returns.

    3) If no characters were extracted for whatever reason (not even the discarded delimiter), getline sets failbit and returns.
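So the behaviour can be relied on, but rather than depending on the string keeping its old value, it is usually clearer to test the stream that getline returns. Here is a minimal sketch of that idea; the helper names field_or and split are hypothetical (they are not part of the standard library or of your code), and the fallback parameter is just one possible way to signal "no such field".

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the field at zero-based `index`,
// or `fallback` if the input has fewer fields than that.
std::string field_or(const std::string& s, char delim, int index,
                     const std::string& fallback) {
    std::istringstream in(s);
    std::string field;
    for (int i = 0; i <= index; ++i) {
        // getline returns the stream, which converts to false once
        // failbit is set: either the sentry check failed (e.g. eofbit
        // was set by the previous call) or no characters were extracted.
        if (!std::getline(in, field, delim)) {
            return fallback;
        }
    }
    return field;
}

// Hypothetical helper: collects every field that getline reads successfully.
// End-of-file acts as a final delimiter, so the last segment is kept.
std::vector<std::string> split(const std::string& s, char delim) {
    std::istringstream in(s);
    std::vector<std::string> fields;
    std::string field;
    while (std::getline(in, field, delim)) {
        fields.push_back(field);
    }
    return fields;
}
```

With your input "321;;123", field_or gives "321", an empty string and "123" for indices 0 to 2, and the fallback for index 3. Note one consequence of rule 3: split("a;b;", ';') yields only two fields, because after the trailing delimiter there is nothing left to extract and getline sets failbit instead of producing an empty third field.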