The function tokenize below is intended to set *size to 0 if sprt doesn't occur within str. For example, if sprt points to "|" and str to "D AO D", chunk[1] is supposed to be a NULL pointer and n is supposed to be set to 0:
void
tokenize(char *str,
         const char *sprt /*separator*/,
         char **buffer,
         int *size /*tokens length*/)
{
    char *chunk[2] = {NULL, NULL};
    //store str value into chunk[0]
    chunk[0] = calloc(strlen(str)+1, sizeof(char));
    strcpy(chunk[0], str);
    if (buffer!=NULL)
    {
        int sz = 0;
        chunk[1] = strtok(str, sprt);
        while (chunk[1]!=NULL)
        {
            buffer[sz] = calloc(strlen(chunk[1])+1, sizeof(char));
            strcpy(buffer[sz], chunk[1]);
            chunk[1] = strtok(NULL, sprt);
            sz++;
        }
    }
    else
    {
        *size=0;
        //if chunk is not NULL, the iteration begins => size > 0
        chunk[1] = strtok(str, sprt);
        while (chunk[1]!=NULL)
        {
            (*size)++;
            chunk[1] = strtok(NULL, sprt);
        }
        printf("size=%i\n", *size);
    }
    //restore str value from chunk[0]
    strcpy(str, chunk[0]);
    if (chunk[0]!=NULL) free(chunk[0]);
    if (chunk[1]!=NULL) free(chunk[1]);
}
However, when testing the function with the following code, "bug: n really needs to be 0!" gets displayed, which means that strtok didn't work as I expected:
int main()
{
    char *test = calloc(7, sizeof(char));
    strcpy(test, "D AO D");
    int n;
    tokenize(test, "|", NULL, &n);
    if (n>0)
        printf("bug: n really needs to be 0!\n");
    else
        printf("no bug\n");
}
I don't really know what caused this UB. What am I doing wrong?
The first strtok call returns a pointer to the start of the original string "D AO D": since there is no "|" delimiter in this string, the whole string is treated as a single token:
chunk[1] = strtok(str, sprt);
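This behaviour can be checked in isolation. The following is only a minimal sketch: the helper name first_token_is_whole_string is hypothetical (not part of the question's code), and it works on a heap copy because strtok modifies the buffer it scans.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (illustration only): returns 1 when the first
   strtok call on a copy of text yields the entire string as one token,
   and 0 otherwise (including for an empty string or allocation failure). */
int first_token_is_whole_string(const char *text, const char *delims)
{
    size_t len = strlen(text);
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return 0;
    memcpy(copy, text, len + 1);   /* strtok writes into its argument */

    char *token = strtok(copy, delims);
    /* The token spans the whole string if it starts at the beginning
       of the buffer and has the same length as the input. */
    int whole = token != NULL && token == copy && strlen(token) == len;

    free(copy);
    return whole;
}
```

With text "D AO D" and delims "|", this helper returns 1: no delimiter is found, so the first token is the full string rather than NULL.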
Then the while loop condition passes, since chunk[1] is a non-NULL pointer:
while (chunk[1]!=NULL)
{
    (*size)++;
    chunk[1] = strtok(NULL, sprt);
}
and *size is incremented in the first iteration. The next strtok call returns NULL, because only the terminating '\0' byte remains, and the loop ends because its condition is no longer met. Thus, *size becomes equal to 1, and this is the expected behaviour.
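If the goal is to get 0 when the separator never occurs, one possible approach is to test for the separator explicitly before counting tokens. The sketch below is only one way to do it: the helper name count_split_tokens is hypothetical, and it assumes that "separator absent" should map to 0 while any other input reports the number of tokens strtok would produce. Note that both strpbrk and strtok treat sprt as a set of delimiter characters, not as a substring.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (illustration only): returns 0 when none of the
   characters in sprt occur in str, otherwise the number of tokens that
   strtok finds. Returns -1 if memory allocation fails. The input string
   is left untouched because tokenizing is done on a copy. */
int count_split_tokens(const char *str, const char *sprt)
{
    /* No separator character anywhere in str: report 0 tokens. */
    if (strpbrk(str, sprt) == NULL)
        return 0;

    size_t len = strlen(str);
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, str, len + 1);

    int count = 0;
    for (char *tok = strtok(copy, sprt); tok != NULL; tok = strtok(NULL, sprt))
        count++;

    free(copy);
    return count;
}
```

Called with "D AO D" and "|", this returns 0, whereas "D|AO|D" with "|" gives 3. Working on a copy also removes the need to save and restore str around the strtok calls.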