Do I understand correctly that strtok leaves the source string larded with null characters?
I could imagine that each new call first replaces the null character it previously wrote with the original character and then continues, and that the last call, which returns null because there are no more matches, restores the last null character it wrote. As a result, the source string would end up unmodified. (Of course, if you stopped before this last call, the source string would remain modified.)
But no documentation mentions such a strategy. So must I first copy the source string to another buffer before processing it with strtok if I want the source string to remain unmodified?
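For illustration, here is a minimal sketch of the copy-first approach the question describes. The helper name count_tokens_preserving is hypothetical (it is not a standard function); it tokenizes a heap-allocated copy so the caller's string is never written to.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the tokens in src without modifying src.
 * strtok runs on a private copy, so any NUL bytes it writes land there.
 * Returns -1 if the copy cannot be allocated. */
int count_tokens_preserving(const char *src, const char *delims)
{
    assert(src != NULL && delims != NULL);

    size_t len = strlen(src) + 1;      /* include the terminating NUL */
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, src, len);

    int count = 0;
    for (char *tok = strtok(copy, delims); tok != NULL;
         tok = strtok(NULL, delims))
        count++;

    free(copy);
    return count;
}
```

Because the parameter is const char *, this can also be called on string literals, which must never be passed to strtok directly.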
The C standard explicitly states "breaks the string ... into...":
A sequence of calls to the strtok function breaks the string pointed to by s1 into a sequence of tokens
Explicitly breaking "the string pointed to by s1" more than implies the original string is modified.
Synopsis
#include <string.h>
char *strtok(char * restrict s1,
const char * restrict s2);
It's char * restrict s1, tellingly lacking any const.
Note that it's the "sequence of calls" that "breaks" the string. Restoring the string after each token is parsed would not comply with the requirement to "break the string", or that the string be "broken" after the "sequence of calls".
POSIX makes the modification explicit (bolding mine):
The strtok() function then searches from there for a byte that is contained in the current separator string. If no such byte is found, the current token extends to the end of the string pointed to by s, and subsequent searches for a token shall return a null pointer. If such a byte is found, it is overwritten by a NUL character, which terminates the current token. The strtok() function saves a pointer to the following byte, from which the next search for a token shall start.
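To see the overwriting directly, here is a small sketch. The helper name nul_bytes_after_tokenizing is hypothetical; it tokenizes a writable buffer in place and then counts how many of its first n bytes are NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: runs strtok over buf until it returns NULL, then
 * counts the NUL bytes among the first n bytes of buf. buf must be
 * writable (an array, not a string literal). */
size_t nul_bytes_after_tokenizing(char *buf, size_t n, const char *delims)
{
    assert(buf != NULL && delims != NULL);

    /* Consume every token; only the side effect on buf matters here. */
    for (char *tok = strtok(buf, delims); tok != NULL;
         tok = strtok(NULL, delims))
        ;

    size_t nuls = 0;
    for (size_t i = 0; i < n; i++)
        if (buf[i] == '\0')
            nuls++;
    return nuls;
}
```

With char buf[] = "a,b,c" and sizeof buf as n, the buffer ends up holding a, NUL, b, NUL, c, NUL: both commas have been overwritten and nothing restores them, even after the final call returns a null pointer.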