How do I create a function in C that allows me to split a string based on a delimiter into an array?


I want to create a function in C that takes a string and a delimiter and returns an array containing the parts of the string, split on that delimiter. This is commonly used to separate a sentence into words.

e.g.: "hello world foo" -> ["hello", "world", "foo"]

However, I'm new to C and a lot of the pointer things are confusing me. I got an answer mostly from this question, but it does it inline, so when I try to separate it into a function the logistics of the pointers are confusing me:

This is what I have so far:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void split_string(char string[], char *delimiter, char ***result) {
    char *p = strtok(string, delimiter);
    int i, num_spaces = 0;

    while (p != NULL) {
        num_spaces++;
        &result = realloc(&result, sizeof(char *) * num_spaces);

        if (&result == NULL) {
            printf("Memory reallocation failed.");
            exit(-1);
        }

        &result[num_spaces - 1] = p;

        p = strtok(NULL, " ");
    }

    // Add the null pointer to the end of our array
    &result = realloc(split_array, sizeof(char *) * num_spaces + 1);
    &result[num_spaces] = 0;

    for (i = 0; i < num_spaces; i++) {
        printf("%s\n", &result[i]);
    }

    free(&result);
} 

int main(int argc, char *argv[]) {
    char str[] = "hello world 1 foo";
    char **split_array = NULL;

    split_string(str, " ", &split_array);

    return 0;
}

The gist is that I have a function that accepts a string, a delimiter, and a pointer to where the result should be saved, and then constructs the result. The result variable starts out as NULL with no memory allocated, and I gradually reallocate memory for it as needed.

But I'm really confused by the pointers, like I said. I know my result is of type char **, since a string is of type char * and there are many of them, so you need a pointer to each. But then I'm supposed to pass the location of that char ** to the new function, right, so it becomes a char ***? When I try to access it with &, though, the compiler doesn't seem to like it.

I feel like I'm missing something fundamental here, I'd really appreciate insight into what is going wrong with the code.


Solution

  • You are confusing dereferencing (*) with taking an address (&), which is the complete opposite. Also, split_array is referenced inside the function but is only declared down in main, so it is not in scope there. Even if you had the dereferencing and addressing correct, the code would still have other issues.
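
    To make the difference concrete, here is a minimal sketch of how an out-parameter of type char *** is meant to be used. The function name make_array and the "first" placeholder string are hypothetical, chosen only for illustration; they are not part of the question's code.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates room for n pointers and hands the
   array back to the caller through the out-parameter `result`.
   Returns 0 on success, -1 on failure. */
int make_array(char ***result, size_t n)
{
    char **tmp;

    if (n == 0)
        return -1;

    /* result   has type char ***  : the address of the caller's char ** variable
       *result  has type char **   : the caller's variable itself
       &result  has type char **** : the address of this local parameter,
                                     which is not assignable and not what we want */
    tmp = calloc(n, sizeof *tmp);
    if (tmp == NULL)
        return -1;

    *result = tmp;           /* dereference to write the caller's variable */
    (*result)[0] = "first";  /* parentheses needed: [] binds tighter than * */
    return 0;
}
```

    A caller would declare char **arr = NULL; and pass &arr, so the & belongs at the call site and the * belongs inside the function.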

    I'm fairly sure you're trying to do this:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>   /* strtok */
    
    void split_string(char string[], const char *delimiter, char ***result)
    {
        char *p = strtok(string, delimiter);
        void *tmp = NULL;
        int count=0;
    
        *result = NULL;
    
        while (p != NULL)
        {
            tmp = realloc(*result, (count+1)*sizeof **result);
            if (tmp)
            {
                *result = tmp;
                (*result)[count++] = p;
            }
            else
            {   // failed to expand
                perror("Failed to expand result array");
                exit(EXIT_FAILURE);
            }
    
            p = strtok(NULL, delimiter);
        }
    
        // add null pointer
        tmp = realloc(*result, (count+1)*sizeof(**result));
        if (tmp)
        {
            *result = tmp;
            (*result)[count] = NULL;
        }
        else
        {
            perror("Failed to expand result array");
            exit(EXIT_FAILURE);
        }
    }
    
    int main()
    {
        char str[] = "hello world 1 foo", **toks = NULL;
        char **it;
    
        split_string(str, " ", &toks);
    
        for (it = toks; *it; ++it)
            printf("%s\n", *it);
        free(toks);
    }
    

    Output

    hello
    world
    1
    foo
    

    Honestly, this would be cleaner if the function returned the result rather than using an in/out parameter, but you chose the latter, so there you go.
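
    For comparison, a return-value version might look roughly like the sketch below. The name split_string_ret is made up for this example, and returning NULL on allocation failure (instead of calling exit) is my own assumption about what a caller would want; adjust to taste.

```c
#include <stdlib.h>
#include <string.h>

/* Splits `string` in place (strtok overwrites delimiters with '\0') and
   returns a NULL-terminated array of pointers into it, or NULL if an
   allocation fails. The caller releases the array with free(); the tokens
   themselves live inside `string` and must not be freed. */
char **split_string_ret(char *string, const char *delimiter)
{
    char **result = NULL;
    size_t count = 0;
    char *p;

    for (p = strtok(string, delimiter); p != NULL; p = strtok(NULL, delimiter)) {
        /* +2: one slot for this token, one reserved for the terminator */
        char **tmp = realloc(result, (count + 2) * sizeof *tmp);
        if (tmp == NULL) {
            free(result);
            return NULL;
        }
        result = tmp;
        result[count++] = p;
    }

    if (result == NULL) {
        /* No tokens found: still return a valid, empty array */
        result = malloc(sizeof *result);
        if (result == NULL)
            return NULL;
    }

    result[count] = NULL;
    return result;
}
```

    The caller then writes char **toks = split_string_ret(str, " ");, checks for NULL, loops until it reaches the NULL entry, and calls free(toks). No triple pointer is needed anywhere.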

    Best of luck.