
Why does strtok() not tokenize the string a certain way?


I'm trying to tokenize a string using brackets [] as delimiters. One input tokenizes exactly how I want, but another fails. When the string has characters before the opening bracket it works fine, but when nothing comes before the bracket I get the wrong result.

This one fails: token2 ends up being NULL and token is "name]", with the closing bracket still attached.

char name[] = "[name]";
char *token = strtok(name, "[");
char *token2 = strtok(NULL, "]");

Output:

token = name]
token2 = NULL

However, if I have the following, then it works just fine.

char line[] = "Hello [name]";
char *tok = strtok(line, "[");
char *tok2 = strtok(NULL, "]");

Output:

tok = Hello
tok2 = name

I don't understand what I'm doing wrong when the input is simply something like "[name]". I want just what's inside the brackets only.

Edit: Thanks for the input, everyone. I found a solution to what I'm trying to do. Per @Ryan and @StoryTeller's advice, I first check whether the input begins with [ and, if it does, use "[]" as the delimiter set. Here's what I tried, and it worked for this input:

char name[] = "[name]", *token = NULL, *token2 = NULL;

if (name[0] == '[')
{
    token = strtok(name, "[]");
}
else
{
    token = strtok(name, "[");
    token2 = strtok(NULL, "]");
}
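
The same check can be wrapped in a function so that one call handles both kinds of input. This is only a minimal sketch: `bracket_token` is a hypothetical name, not part of the original code, and it assumes the caller wants only the first bracketed token.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (name is an assumption, not from the original post).
 * Returns a pointer to the text inside the first [...] pair of s, or NULL
 * if no such token is found. Like strtok() itself, it modifies s in place. */
char *bracket_token(char *s)
{
    if (s[0] == '[') {
        /* strtok() skips the leading '[' as a delimiter, and the token
         * then ends at the first ']'. */
        return strtok(s, "[]");
    }

    /* Discard whatever precedes '[' ... */
    if (strtok(s, "[") == NULL) {
        return NULL;
    }
    /* ... then read up to the closing ']'. */
    return strtok(NULL, "]");
}
```

With `char a[] = "[name]";` and `char b[] = "Hello [name]";`, both `bracket_token(a)` and `bracket_token(b)` return a pointer to "name". The argument must be a writable array: strtok() overwrites delimiters with '\0', so passing a string literal is undefined behavior.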

Solution

  • In short: in your first example, the second call to strtok() behaves like a call on an empty string, which is why you get NULL.

    Each call to strtok() returns the next token, using the delimiter set passed to that call. In your 1st try:

    char name[] = "[name]";
    char *token = strtok(name, "[");
    char *token2 = strtok(NULL, "]");
    

    The delimiter you passed is "[", and strtok() skips any delimiters at the start of the string before it begins a token. The 1st call therefore skips the leading "[" and returns "name]": no further "[" follows, so the token runs to the end of the string, closing bracket included. The second call returns NULL because the first token already reached the end of your original string, so invoking strtok() again is like invoking it on an empty string.

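    strtok() does not copy your string. It keeps an internal (static) pointer to where it stopped in the array you passed, overwrites the delimiter that ends each token with '\0', and resumes from that pointer when called with NULL. After your 1st call, that pointer was already at the end of the string, so there was nothing left to tokenize.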

    In your 2nd try:

    char line[] = "Hello [name]";
    char *tok = strtok(line, "[");
    char *tok2 = strtok(NULL, "]");
    

    Here the delimiter is in the middle of the string, so the 1st call returns a token ("Hello ", including the trailing space) and there is still "name]" left after the position strtok() saved. That lets the 2nd call of strtok() return a valid token instead of NULL.
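
To see where each token actually lives, it can help to record the offsets of the returned pointers within the array that was passed in. The sketch below is illustrative only: `probe_offsets` is a hypothetical helper, and it simply repeats the two calls from the question.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical probe (name is an assumption). Repeats the two calls from
 * the question on buf and stores the offset of each returned token within
 * buf, or -1 where strtok() returned NULL. */
void probe_offsets(char *buf, long *first, long *second)
{
    char *t1 = strtok(buf, "[");   /* skips leading '[', ends at next '[' */
    char *t2 = strtok(NULL, "]");  /* resumes where the first call stopped */

    *first  = (t1 != NULL) ? (long)(t1 - buf) : -1;
    *second = (t2 != NULL) ? (long)(t2 - buf) : -1;
}
```

For `char a[] = "[name]";` the offsets are 1 and -1: the first token starts right after the skipped "[" and runs to the end of the array, leaving nothing for the second call. For `char b[] = "Hello [name]";` they are 0 and 7, and afterwards `b[6]` and `b[11]` hold '\0' where the brackets used to be, which shows that the tokens are pointers into the caller's own array, not copies.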