Tags: c, multidimensional-array, malloc, delimiter

Function that splits a string on a given delimiter


I have a function, ft_split(char const *s, char c), that is supposed to take a string s and a delimiter character c and split s into an array of smaller strings.

This is the third or fourth day I have been trying to solve it. My approach:

  1. Count the characters in the string, counting each run of consecutive delimiters only once (if the delimiter is a space and there are two or more spaces in a row, only one of them is counted). Why? That one slot per delimiter is the memory for the '\0' at the end of each split string.

  2. Find the number of characters (k) between delimiters -> malloc memory -> copy from the string into that memory -> copy from that memory into the big array -> start over.

But the function crashes with a segmentation fault. The debugger shows that after allocating the "big" memory it does not enter the while loop but goes straight to big[y][z] = small[z], after which it exits the function.

Any tips are appreciated.

#include "libft.h"
#include <stdlib.h>
int ft_count(char const *s, char c)
{
    int i;
    int j;

    i = 0;
    j = 0;
    while (s[i] != '\0')
    {
        i++;
        if (s[i] == c)
        {
            i++;
            while (s[i] == c)
            {
                i++;
                j++;
            }
        }
    }
    return (i - j);
}
char **ft_split(char const *s, char c)
{
    int i;
    int k;
    int y;
    int z;
    char *small;
    char **big;

    i = 0;
    y = 0;
    if (!(big = (char **)malloc((ft_count(s, c) + 1) * sizeof(char))))
        return (0);
    while (s[i] != '\0')
    {
        while (s[i] == c)
            i++;
        k = 0;
        while (s[i] != c)
        {
            i++;
            k++;
        }
        if (!(small = (char *)malloc(k * sizeof(char) + 1)))
            return (0);
        z = 0;
        while (z < k)
        {
            small[z] = s[i - k + z];
            z++;
        }
        small[k] = '\0';
        z = 0;
        while (z < k)
        {
            big[y][z] = small[z];
            z++;
        }
        y++;
        free(small);
    }
    big[y][i] = '\0';
    return (big);
}
int main()
{
    char a[] = "jestemzzbogiemzalfa";
    ft_split(a, 'z');
}

Solution

  • There are multiple problems in the code:

    • the ft_count() function is incorrect: it returns a character count, whereas the array of pointers needs the number of substrings. In addition, you increment i before testing for separators, so the result is also wrong if the string starts with separators. You should instead count the number of transitions from separator to non-separator:
    int ft_count(char const *s, char c)
    {
        char last;
        int i;
        int j;
    
        last = c;
        i = 0;
        j = 0;
        while (s[i] != '\0')
        {
            if (last == c && s[i] != c)
            {
                j++;
            }
            last = s[i];
            i++;
        }
        return j;
    }
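
To make the difference concrete, the sketch below puts the question's counting logic next to the transition-counting logic. The names count_question and count_transitions are hypothetical and only used here so that both versions can coexist in one file.

```c
/* The counting loop from the question, copied under a hypothetical name.
 * It returns a character count (runs of separators collapsed to one),
 * not the number of substrings. */
int count_question(const char *s, char c)
{
    int i = 0;
    int j = 0;

    while (s[i] != '\0')
    {
        i++;
        if (s[i] == c)
        {
            i++;
            while (s[i] == c)
            {
                i++;
                j++;
            }
        }
    }
    return i - j;
}

/* Transition counting: one more substring each time a non-separator
 * follows a separator (or starts the string). */
int count_transitions(const char *s, char c)
{
    char last = c;
    int n = 0;

    for (int i = 0; s[i] != '\0'; i++)
    {
        if (last == c && s[i] != c)
            n++;
        last = s[i];
    }
    return n;
}
```

For the test string "jestemzzbogiemzalfa" with 'z' as the separator, count_question returns 18 while count_transitions returns 3, which is the number of pointers ft_split actually has to store (plus one for the terminating NULL).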
    

    Furthermore, the ft_split() function is incorrect too:

    • the amount of memory allocated for the big array of pointers is wrong: you should multiply the number of elements by the element size, which is not sizeof(char) but sizeof(char *).
    • you add an empty string at the end of the array if the string ends with separators. You should test for a null byte after skipping the separators.
    • you do not test for the null terminator when scanning for the separator after the item, so the scan runs past the end of the string when the last item is not followed by a separator.
    • you do not store the small pointer into the big array of pointers. big[y] is never initialized, so writing to big[y][z] has undefined behavior, which explains the crash. Instead of copying the string to big[y][...], you should just set big[y] = small and not free(small).
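
The first and last points can be illustrated in isolation. This is a minimal sketch, assuming two hypothetical helpers (alloc_string_array and store_copy) that are not part of libft: one sizes the array by the pointer type, the other hands ownership of the freshly allocated string to an array slot instead of copying through an uninitialized pointer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate room for n string pointers plus the
 * terminating NULL. sizeof *arr is sizeof(char *), not sizeof(char). */
char **alloc_string_array(size_t n)
{
    char **arr = malloc((n + 1) * sizeof *arr);

    if (arr == NULL)
        return NULL;
    for (size_t i = 0; i <= n; i++)
        arr[i] = NULL;          /* every slot starts out empty */
    return arr;
}

/* Hypothetical helper: copy the first len bytes of src into a new
 * string and store that pointer in *slot. Returns 0 on success. */
int store_copy(char **slot, const char *src, size_t len)
{
    char *small = malloc(len + 1);

    if (small == NULL)
        return -1;
    memcpy(small, src, len);
    small[len] = '\0';
    *slot = small;              /* the array owns small now: no free() here */
    return 0;
}
```

With these, store_copy(&arr[0], "jestemzz", 6) leaves arr[0] pointing at a separately allocated "jestem"; the caller frees each string and then the array when done.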

    Here is a modified version:

    char **ft_split(char const *s, char c)
    {
        int i;
        int k;
        int y;
        int z;
        char *small;
        char **big;
    
        if (!(big = (char **)malloc((ft_count(s, c) + 1) * sizeof(*big))))
            return (0);
        i = 0;
        y = 0;
        while (42)  // aka 42 for ever :)
        {
            while (s[i] == c)
                i++;
            if (s[i] == '\0')
                break;
            k = 0;
            while (s[i + k] != '\0' && s[i + k] != c)
            {
                k++;
            }
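            if (!(small = (char *)malloc((k + 1) * sizeof(char))))
            {
                /* release the strings stored so far to avoid a leak */
                while (y > 0)
                    free(big[--y]);
                free(big);
                return (0);
            }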
            z = 0;
            while (z < k)
            {
                small[z] = s[i];
                z++;
                i++;
            }
            small[k] = '\0';
            big[y] = small;
            y++;
        }
        big[y] = NULL;
        return (big);
    }
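
Since the array returned by ft_split() ends with a NULL pointer and each string is a separate allocation, the caller is responsible for walking the array and freeing every piece. A possible sketch of that caller-side code follows; split_len and free_split are hypothetical names, not libft functions.

```c
#include <stdlib.h>

/* Hypothetical helper: number of strings before the terminating NULL. */
size_t split_len(char **words)
{
    size_t n = 0;

    if (words == NULL)
        return 0;
    while (words[n] != NULL)
        n++;
    return n;
}

/* Hypothetical helper: free each string, then the array itself.
 * Accepts NULL so it can be called on a failed split. */
void free_split(char **words)
{
    if (words == NULL)
        return;
    for (size_t i = 0; words[i] != NULL; i++)
        free(words[i]);
    free(words);
}
```

A caller would typically keep the result, for example char **parts = ft_split(a, 'z');, loop over parts[0] up to parts[split_len(parts) - 1], and finish with free_split(parts). The main() in the question discards the return value, so the memory would leak there.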
    

    42 rant:

    These coding conventions (the norminette) are counterproductive! for loops are more readable and safer than these while loops, casts on the return value of malloc() are useless and confusing, and the parentheses around the argument of return are childish.
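
For comparison, here is roughly how the same split might look without those conventions: for loops, no cast on malloc(), no parentheses around return values. This is only a sketch under my own assumptions; next_word and split_words are hypothetical names, and using one scanning helper for both the counting pass and the copying pass is a design choice of this example, not something taken from the code above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: skip separators starting at s. Returns a pointer
 * to the next word and stores its length in *len, or returns NULL when
 * only separators (or nothing) remain. */
const char *next_word(const char *s, char c, size_t *len)
{
    while (*s != '\0' && *s == c)
        s++;
    if (*s == '\0')
        return NULL;
    *len = 0;
    while (s[*len] != '\0' && s[*len] != c)
        (*len)++;
    return s;
}

/* Split s on c into a NULL-terminated array of newly allocated strings.
 * Returns NULL if any allocation fails, after freeing partial results. */
char **split_words(const char *s, char c)
{
    size_t count = 0;
    size_t len = 0;
    size_t y = 0;
    const char *p;
    char **big;

    /* first pass: count the words */
    for (p = next_word(s, c, &len); p != NULL; p = next_word(p + len, c, &len))
        count++;
    big = malloc((count + 1) * sizeof *big);
    if (big == NULL)
        return NULL;
    /* second pass: copy each word into its own allocation */
    for (p = next_word(s, c, &len); p != NULL; p = next_word(p + len, c, &len)) {
        big[y] = malloc(len + 1);
        if (big[y] == NULL) {
            while (y > 0)
                free(big[--y]);
            free(big);
            return NULL;
        }
        memcpy(big[y], p, len);
        big[y][len] = '\0';
        y++;
    }
    big[y] = NULL;
    return big;
}
```

Called as split_words("jestemzzbogiemzalfa", 'z'), this yields the three strings "jestem", "bogiem" and "alfa" followed by a NULL pointer.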