Tags: string, c++11, vector, split, delimiter

How can I implement multiple delimiters when splitting a string into a vector?


My professor needs us to split a string while ignoring any punctuation, so that "Hello, my name is Jack!" is split without the comma and exclamation point. Specifically, we must discard commas, periods, question marks, exclamation points, semicolons and colons.

The code below works but the only delimiter is a space. How do I add more delimiters with what I've got?

Function call:

tokenize(code, ' ', tokens);

The function that splits the string and stores it into a vector:

void tokenize(const string& str, char delim, vector<string>& tokens)
{
    // size_t rather than int, so the comparison with string::npos is exact
    size_t tokenStart = 0;

    size_t delimPos = str.find(delim);

    while(delimPos != string::npos)
    {
        string tok = str.substr(tokenStart, delimPos - tokenStart);

        tokens.push_back(tok);

        delimPos++;

        tokenStart = delimPos;

        delimPos = str.find(delim, delimPos);
    }

    // the text after the last delimiter (or the whole string if none was found)
    tokens.push_back(str.substr(tokenStart));
}

Solution

  • string::find_first_of matches any one of the characters in its argument, so you could use it in place of string::find. Pass the delimiters as a string (for example " ,.?!;:") instead of a single char, and keep updating delimPos the way you do now, something like:

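    void tokenize(const string& str, const string& delim, vector<string>& tokens)
    {
        size_t tokenStart = 0;

        // find_first_of matches any single character contained in delim
        size_t delimPos = str.find_first_of(delim);

        while(delimPos != string::npos)
        {
            string tok = str.substr(tokenStart, delimPos - tokenStart);

            tokens.push_back(tok);

            // step past the delimiter before searching again
            delimPos++;

            tokenStart = delimPos;

            delimPos = str.find_first_of(delim, delimPos);
        }

        // whatever follows the last delimiter is the final token
        tokens.push_back(str.substr(tokenStart));
    }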
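
One thing to watch with this approach: in "Hello, my name is Jack!" the comma is followed by a space, so splitting at every delimiter gives an empty token between the two. If that is not wanted, a possible variant is to pair find_first_of with string::find_first_not_of so that runs of delimiters are skipped. The following is only a sketch under that assumption; the name tokenizeSkippingEmpty and the parameter name delims are made up for illustration and are not part of the original code.

```cpp
#include <string>
#include <vector>

using namespace std;

// Hypothetical variant: treats any run of delimiter characters as a single
// separator, so no empty tokens are stored.
void tokenizeSkippingEmpty(const string& str, const string& delims, vector<string>& tokens)
{
    // Start of the first token: the first character that is NOT a delimiter.
    size_t tokenStart = str.find_first_not_of(delims);

    while (tokenStart != string::npos)
    {
        // End of the token: the next delimiter, or npos if the token
        // runs to the end of the string.
        size_t tokenEnd = str.find_first_of(delims, tokenStart);

        // substr clamps the length, so an npos end simply takes the rest.
        tokens.push_back(str.substr(tokenStart, tokenEnd - tokenStart));

        // Skip the whole run of delimiters; yields npos when nothing is left.
        tokenStart = str.find_first_not_of(delims, tokenEnd);
    }
}
```

Called as tokenizeSkippingEmpty(code, " ,.?!;:", tokens) on the example sentence, this should leave the five words Hello, my, name, is and Jack in tokens, with no punctuation and no empty strings.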