Search code examples
cstringmemory-managementtrim

in trim function there's an if-check not working properly, why?


EDIT: I've edited my code, and this is the result:

#include <stdlib.h>
#include <ctype.h>

char *trim(const char *s) {
    if (s == NULL) {
        return NULL;
    }
    size_t count_1 = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        count_1++;
    }
    if (count_1 < 1) {
        return NULL; 
    }
    size_t count_2 = 0;  
    if (isspace(s[0])) {
        count_2++;
    }
    if (isspace(s[count_1 - 1])) {
        count_2++;
    }
    size_t max_length = (count_1 - count_2) + 1u;
    if (max_length >= count_1) {
        return NULL;
    }
    char *str = malloc(max_length);
    if (!str) {
        return NULL;
    }
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (isspace(s[i]) == 0) { // if isspace is false. 
            str[i] = s[i];
        }
    }
    str[count_1 - count_2] = 0;
    return str;
}

int main(void) {
    char s[] = " a b ";
    char *str;
    str = trim(s);

    free(str);
    return 0;
}

now, the problem is here

    for (size_t i = 0; s[i] != '\0'; i++) {
        if (isspace(s[i]) == 0) { // if isspace is false. 
            str[i] = s[i];
        }

I have a buffer overrun, even if I've checked the length. In fact, if count_1 is equal to zero, I have a buffer overrun error, but I've excluded this case, but the problem persists. By debugging line-by-line, I've noticed I have an undefined behavior.


I wanted to try to simplify the suggested solution for this exercise, therefore I've written another code for the same exercise.

this is the original answer: trim function halve the memory size to remove the whitespaces?

this is the minimal reproducible code:

#include <stdlib.h>
#include <ctype.h>

char *trim(const char *s) {
    size_t count_1 = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        count_1++;
    }
    size_t count_2 = 0;
    if (isspace(s[0])) {
        count_2++;
    }
    if (isspace(s[count_1])) {
        count_2++;
    }
    size_t max_length = (count_1 - count_2) + 1u; 
    if (max_length >= count_1) {
        return NULL; 
    }
    char *str = malloc(max_length); 
    if (!str) {
        return NULL; 
    }
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (isalpha(s[i]) == 0) { // if isalpha is false. 
            str[i] = s[i]; 
        } 
        str[count_1 - count_2] = 0; 
    }
    return str; 
}

int main(void) {
    char s[] = " a b "; 
    char *str; 
    str = trim(s);

    free(str); 
    return 0; 
} 

here's the detailed explanation about what I've done so far:

  • I've counted characters of the string s, and the length is stored in count_1.
  • I've counted how many whitespaces I have at the beginning of the string, and at the end of the string; and the amount is stored in count_2.

note: I've chosen to use isspace function (in <ctype.h>), because I tried to type ' ' (i.e a whitespace), but the result is not correct, and these if-checks are not evaluated whatsoever. (I used the debugger line-by-line to state this thing).

  • before malloc the memory I've used a check condition to avoid buffer overrun (it's similar to the question I asked yesterday), meaning I've allocated enough memory if and only if max_length is less than count_1. doing this way, I have no buffer overrun warning.

I think I can avoid to explain the final steps, because they are self explanatory and I also think they doesn't cause errors. If I'm wrong, I'll edit this point.

issue I have no clue how to fix it:

  • by debugging line-by-line, I've noticed that when the flow of execution goes to the 2nd if-check, the if body is not executed whatsoever. And this is strange, because the first one works fine.

Solution

  • There are multiple problems in your code:

    • count_1 is the length of the string, you should name it more explicitly as len
    • you return NULL if no trimming is needed. This is questionable. You should probably return a copy of the string in all cases and only return NULL in case of allocation failure.
    • you only test for 1 space char at the start of the string.
    • you only test for 1 space char at the end of the string.
    • furthermore this space might be counted twice if the string is " ".
    • max_length is a misnomer: it is not the length of the new string, but the allocation size, new_size seems more appropriate.
    • in the final loop, you use the same index i into the original and the new string: this is incorrect. You should use a separate index so characters from the original string can be copied after skipping the initial space.
    • str[count_1 - count_2] = 0; is redundant inside the loop: you should move this statement after the end of the loop.
    • argument values of type char should be cast as (unsigned char) when passed to the functions and macros defined in <ctype.h> to avoid undefined behavior on negative values on platforms where the char type is signed. These functions are only defined for the values of type unsigned char (between 0 and UCHAR_MAX) and the special negative value EOF. These values are the ones returned by getchar() and getc().

    Here is a modified version:

    #include <ctype.h>
    #include <stdio.h>
    #include <stdlib.h>
    
    char *trim(const char *s) {
        if (s == NULL) {
            return NULL;
        }
        size_t start, end;
        for (start = 0; isspace((unsigned char)s[start]); start++) {
            continue;
        }
        for (end = start; s[end] != '\0'; end++) {
            continue;
        }
        while (end > start && isspace((unsigned char)s[end - 1])) {
            end--;
        }
        // if you are allowed to use strndup, you can return the new string this way:
        //return strndup(str + start, end - start);
    
        char *new_str = malloc(end - start + 1);
        if (new_str) {
            size_t j = 0;  // index into the new string
            for (size_t i = start; i < end; i++) {
                new_str[j++] = str[i];
            }
            new_str[j] = '\0';
        }
        return new_str;
    }
    
    int main(void) {
        char s[] = " a b ";
        char *str = trim(s);
        printf("trim(\"%s\") -> \"%s\"\n", s, str);
        free(str);
        return 0;
    }