c, strtok

So strtok is destructive?


Do I understand correctly that strtok leaves the source string larded with null characters?

I could understand an implementation in which each new call first replaces the null character it put there with the original character and then continues, and in which the last call, which returns null because there are no more matches, replaces the null character it last put there with its original. As a result, the source string would end up unmodified. (Of course, should you stop before this last call, the source string would remain modified.)

But no documentation mentions such a strategy. So I must first copy the source string to another buffer before processing with strtok if I want the source string to remain unmodified?


Solution

  • The C standard explicitly states "breaks the string ... into...":

    A sequence of calls to the strtok function breaks the string pointed to by s1 into a sequence of tokens

    Explicitly breaking "the string pointed to by s1" more than implies the original string is modified.

    Note also the synopsis:

    Synopsis

         #include <string.h>
         char *strtok(char * restrict s1,
              const char * restrict s2);
    

    It's char * restrict s1, tellingly lacking any const.

    Note that it's the "sequence of calls" that "breaks" the string. Restoring the string after each token is parsed would not comply with the requirement to "break the string", or that the string be "broken" after the "sequence of calls".

    POSIX makes the modification explicit (note "it is overwritten by a NUL character"):

    The strtok() function then searches from there for a byte that is contained in the current separator string. If no such byte is found, the current token extends to the end of the string pointed to by s, and subsequent searches for a token shall return a null pointer. If such a byte is found, it is overwritten by a NUL character, which terminates the current token. The strtok() function saves a pointer to the following byte, from which the next search for a token shall start.
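
To make the behavior quoted above concrete, here is a minimal sketch in C. The helper names count_overwritten_bytes and count_tokens_on_copy are hypothetical, introduced only for this illustration. The first tokenizes a buffer in place and counts how many delimiter bytes strtok replaced with '\0'. The second shows the usual workaround when the source must stay intact: tokenize a private copy instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: tokenizes buf in place with strtok and returns how
   many bytes inside the original string were overwritten with '\0'. */
size_t count_overwritten_bytes(char *buf, const char *delims)
{
    assert(buf != NULL && delims != NULL);

    size_t len = strlen(buf);   /* measure before strtok changes the buffer */

    for (char *tok = strtok(buf, delims); tok != NULL; tok = strtok(NULL, delims)) {
        /* nothing to do per token; we only care about the side effect */
    }

    size_t nuls = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\0')
            nuls++;
    }
    return nuls;
}

/* Hypothetical helper: counts the tokens in src without modifying src,
   by running strtok on a heap-allocated copy.
   Returns -1 if the copy cannot be allocated. */
int count_tokens_on_copy(const char *src, const char *delims)
{
    assert(src != NULL && delims != NULL);

    size_t size = strlen(src) + 1;      /* include the terminator */
    char *copy = malloc(size);
    if (copy == NULL)
        return -1;
    memcpy(copy, src, size);

    int count = 0;
    for (char *tok = strtok(copy, delims); tok != NULL; tok = strtok(NULL, delims))
        count++;

    free(copy);
    return count;
}
```

Called on a writable array holding "a,b,c" with "," as the delimiter string, count_overwritten_bytes leaves the array reading as just "a" when treated as a C string, because both commas have been replaced. count_tokens_on_copy can safely be given a string literal, since only the copy is ever written to.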