Why does the following code not work as intended?


I have a string and I want to split it into words. The code works when the string passed in ends with a space, but not otherwise.

CODE:

#include <stdio.h>
#include <string.h>

void form_rule2(char * str)
{
    int num_words=0;
    char word[20];
    while(sscanf(str,"%s",word)==1)
    {
        num_words++;
        printf("%s\n",word);

        str+=strlen(word)+1;
    }

}

int main(void)
{
    form_rule2("abcd efgh! ijkl mnop");
}

The output I get is:

abcd
efgh!
ijkl
mnop
P

Clearly the 'P' is extra. Other test cases behave similarly, printing random characters at the end.

I want to understand what is going on here. Is the null terminating character somehow involved?

I am using the gcc compiler on Linux with an x86-64 processor, if that matters.


Solution

  • The problem lies here:

    str+=strlen(word)+1;
    

    This skips the current word, and the space that follows it.

    If the last word in your string is not followed by a space, then you skip past the word and also past the '\0' character that terminates the string. You are then parsing whatever junk happens to follow the string in memory, which is where the extra output comes from.

    Technically you are invoking undefined behaviour by accessing beyond the bounds of the array.
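
    To make the arithmetic concrete, here is a minimal sketch that replays the same "strlen(word) + 1" advance with integer offsets instead of a pointer, so nothing is read out of bounds. The helper name offset_after_last_word is hypothetical, and the sketch assumes words are separated by single spaces and are shorter than 20 characters.

    ```c
    #include <stdio.h>
    #include <string.h>

    /* Hypothetical helper (not from the question): replays the question's
       advance step using offsets, and stops before reading past the
       terminator. Returns the offset reached after the last word. */
    size_t offset_after_last_word(const char *str)
    {
        size_t len = strlen(str);   /* index of the terminating '\0' */
        size_t off = 0;
        char word[20];

        /* off <= len keeps every read inside the string (terminator included) */
        while (off <= len && sscanf(str + off, "%19s", word) == 1) {
            off += strlen(word) + 1;   /* same step as in the question */
        }
        return off;
    }
    ```

    For "abcd efgh! ijkl mnop" the terminator sits at index 20, and the function returns 21, which is one past it. With a trailing space appended, the terminator moves to index 21 and the function again returns 21, so the pointer lands exactly on the '\0' and the loop ends cleanly.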

    One possible solution is to check for a whitespace character at the end, and only skip if it is present.

    str+=strlen(word);
    if (isspace((unsigned char)*str))
        ++str;
    

    You'll need to #include <ctype.h>.

    Note that this isn't very robust against things like having multiple spaces in a row, but that's a separate issue.
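
If runs of spaces or leading whitespace do matter, one possible approach is to let sscanf report how far it read by using the %n conversion, and advance by that amount. The sketch below is only an illustration: the function name split_words and the returned word count are my own additions, not part of the original code.

```c
#include <stdio.h>

/* Hypothetical variant of form_rule2: prints each word on its own line
   and returns how many words were found. */
int split_words(const char *str)
{
    int num_words = 0;
    int consumed = 0;
    char word[20];

    /* %19s leaves room for the terminator in word[20].
       %n stores the number of characters read so far, including any
       leading whitespace that %s skipped; it does not count toward
       the return value of sscanf. */
    while (sscanf(str, "%19s%n", word, &consumed) == 1) {
        num_words++;
        printf("%s\n", word);
        str += consumed;   /* lands just after the word, never past '\0' */
    }
    return num_words;
}
```

Because the advance never exceeds what sscanf actually read, the pointer stops at the terminator whether or not the string ends with a space. One caveat of this sketch: a word longer than 19 characters is split into several pieces rather than overflowing the buffer.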