c++ delimiter istringstream

stringstream with multiple delimiters


This is another question I can't find an answer to, because every example I find uses vectors and my teacher won't let us use vectors in this class.

I need to read in a plain-text version of a book one word at a time, using any number of blank spaces ' ' and any number of non-letter characters as delimiters, so any run of spaces or punctuation needs to separate words. Here's how I did it when blank spaces were the only delimiter:

while(getline(inFile, line)) {
    istringstream iss(line);

    while (iss >> word) {
        table1.addItem(word);
    }
}

EDIT: An example of text read in, and how I need to separate it.

"If they had known;; you wished it, the entertainment.would have"

Here's how the first line would need to be separated:

If
they
had
known
you
wished
it
the
entertainment
would
have

The text will contain at least all standard punctuation, plus things such as ellipses (...) and double dashes (--).

As always, thanks in advance.

EDIT:

So using a second stringstream would look something like this?

while(getline(inFile, line)) {
    istringstream iss(line);

    while (iss >> word) {
        istringstream iss2(word);

        while(iss2 >> letter)  {
            if(!isalpha(letter))
                // do something?
        }
        // do something else?
        table1.addItem(word);
    }
}

Solution

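  • I haven't tested this, as I don't have a compiler in front of me right now, but the idea should work (minor syntax errors aside): replace every non-letter character in the line with a space, then let operator>> split on whitespace as before. Erasing the punctuation instead would glue "entertainment.would" into a single word. You'll need <algorithm> for std::replace_if and <cctype> for std::isalpha.

    while (getline(inFile, line))
    {
        // turn every non-letter into a space, so "known;;" and
        // "entertainment.would" split the same way blanks do
        std::replace_if(line.begin(), line.end(),
                        [](unsigned char c){ return !std::isalpha(c); },
                        ' ');
        istringstream iss(line);

        while (iss >> word)
        {
            // word now contains letters only and is never empty
            table1.addItem(word);
        }
    }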
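
If you would rather check character by character, which is roughly what the second edit in the question was attempting, the inner stringstream isn't needed at all. Here is a minimal sketch under a few assumptions: `forEachWord` and its callback parameter are hypothetical names that do not appear in the question, the callback stands in for `table1.addItem`, and "letter" means whatever `std::isalpha` accepts in the default "C" locale. No vectors are involved.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not from the question): calls onWord(word) for every
// maximal run of letters in `line`. Any non-letter (space, punctuation,
// digit) acts as a delimiter, and runs of delimiters never yield empty words.
template <typename Callback>
void forEachWord(const std::string& line, Callback onWord)
{
    std::string word;
    for (unsigned char c : line) {      // unsigned char keeps std::isalpha well-defined
        if (std::isalpha(c)) {
            word += static_cast<char>(c);   // still inside a word
        } else if (!word.empty()) {
            onWord(word);                   // a delimiter ends the current word
            word.clear();
        }
    }
    if (!word.empty())
        onWord(word);                       // flush the last word on the line
}
```

Inside the `getline` loop this would be called as `forEachWord(line, [&](const std::string& w) { table1.addItem(w); });`, assuming `addItem` accepts a `std::string`.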