
Issue removing extra whitespace from a C++ string


I found this code snippet online.


    void ShortenSpace(string &s)
    {
        // n is length of the original string
        int n = s.length();
     
        // i tracks the next write position; j traverses the string
        int i = 0, j = -1;
     
        // flag that is set to true when a space is found
        bool spaceFound = false;
     
        // Handles leading spaces
        while (++j < n && s[j] == ' ');
     
        // read all characters of original string
        while (j < n)
        {
            // if current character is non-space
            if (s[j] != ' ')
            {
                // if a space precedes ',', '.' or '?'
                if ((s[j] == '.' || s[j] == ',' ||
                     s[j] == '?') && i - 1 >= 0 &&
                     s[i - 1] == ' ')
                    s[i - 1] = s[j++];
     
                else
                    // copy current character to index i
                    // and increment both i and j
                    s[i++] = s[j++];
     
                // set space flag to false when any
                // non-space character is found
                spaceFound = false;
            }
            // if current character is a space
            else if (s[j++] == ' ')
            {
                // If space is seen first time after a word
                if (!spaceFound)
                {
                    s[i++] = ' ';
                    spaceFound = true;
                }
            }
        }
     
        // Remove trailing spaces
        if (i <= 1)
            s.erase(s.begin() + i, s.end());
        else
            s.erase(s.begin() + i - 1, s.end());
    }

The problem is that for the input "test (multiple spaces) test (multiple spaces) test.", it removes the final period and outputs "test test test".

It removes the whitespace correctly, but it also removes the punctuation, which I do not want. I'm still a beginner in C++, so I'm having a hard time figuring out why.


Solution

  • Because the final conditional deletes the last written character whenever i > 1, without checking whether that character is a space.

    The last conditional should also check whether the last character is whitespace:

    // Trim string to result
    if (i <= 1 || s[i-1] != ' ')
        s.erase(s.begin() + i, s.end());
    else
        s.erase(s.begin() + i - 1, s.end());
    

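    I corrected the comment as well: this step does not only trim trailing whitespace, it also removes the stale characters left at the end of the string after the manipulation. The algorithm copies characters toward the front of the string in place, so everything from index i onward is leftover from the original input. If you were to leave out this last conditional, the output would be something like "test test test. test." for the input "test test test." (with multiple spaces between the words; the exact leftover depends on how many spaces there were).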
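
    Putting the corrected conditional back into the function from the question gives roughly the following. This is a minimal sketch, not a definitive implementation: it assumes only plain ' ' characters count as whitespace, and the switch to std::size_t indices, the helper variable punct, and the assert are my own additions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Collapses runs of spaces, drops leading and trailing spaces, and removes a
// space that directly precedes '.', ',' or '?'. Same logic as the question's
// ShortenSpace, with the corrected final trim.
void ShortenSpace(std::string &s)
{
    const std::size_t n = s.length();
    std::size_t i = 0;        // next write position
    std::size_t j = 0;        // read position
    bool spaceFound = false;  // true while inside a run of spaces

    // Skip leading spaces
    while (j < n && s[j] == ' ')
        ++j;

    while (j < n)
    {
        if (s[j] != ' ')
        {
            // punct is a hypothetical helper name, not in the original code
            const bool punct = s[j] == '.' || s[j] == ',' || s[j] == '?';
            if (punct && i >= 1 && s[i - 1] == ' ')
                s[i - 1] = s[j++];  // overwrite the space before punctuation
            else
                s[i++] = s[j++];    // copy the character forward
            spaceFound = false;
        }
        else
        {
            ++j;
            if (!spaceFound)        // keep only the first space of a run
            {
                s[i++] = ' ';
                spaceFound = true;
            }
        }
    }

    // Added sanity check: the write index never passes the end of the string
    assert(i <= s.size());

    // Drop the leftover tail; also drop one trailing space if there is one
    if (i <= 1 || s[i - 1] != ' ')
        s.erase(s.begin() + i, s.end());
    else
        s.erase(s.begin() + i - 1, s.end());
}
```

    With this version, an input that ends in a period keeps it, while an input that ends in spaces still has them trimmed.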