Search code examples
ccasting

How do I use tolower() with a char array?


I'm learning C in school and am doing some input and string comparisons and am running into what looks like a casting issue.

Here is my code:

size_t unit_match_index(char *userInput) {
    char* unit = malloc(strlen(userInput) + 1);
    strcpy(unit, userInput);
    
    //convert to lowercase
    for (size_t i = 0; i < strlen(unit); ++i) {
        unit[i] = tolower(unit[i]);
    /*C6385: invalid data from 'unit': the readable size is 'strlen(userInput)+1' bytes, but 2bytes may be read
      C6386: buffer overrun while writing to 'unit': the writable size is [same as above]
    */
    }
//...
}

After doing a little bit of research, it looks like tolower() looks for an int and returns an int (2 bytes), and thinks strlen(userInput)+1 may equate to 1, making the total unit array size only 1 byte.

Is there something I should be doing to avoid this, or is this just the analyzer being a computer (computers are dumb)? I'm concerned because I will lose marks on my assignment if there are errors.


Solution

  • As suggested in an answer to this related question, these two warnings are caused by a "bug" in the MSVC Code Analyser.

    I even tried the 'fix' I suggested in my answer to that question (that is, using char* unit = malloc(max(strlen(userInput), 0) + 1);) – but it didn't work in your code (not sure why).

    However, what did work (and I have no idea why) is to use the strdup function in place of your calls to malloc and strcpy – it does the same thing but in one fell swoop.

    Adding the casts (correctly)1 suggested in the comments, here's a version of your code that doesn't generate the spurious C6385 and C6386 warnings:

    #include <stdlib.h>
    #include <string.h>
    #include <ctype.h>
    
    size_t unit_match_index(char* userInput)
    {
        char* unit = strdup(userInput);
        //convert to lowercase
        for (size_t i = 0; i < strlen(unit); ++i) {
            unit[i] = (char)tolower((unsigned char)unit[i]);
        }
        //...
        return 0;
    }
    

    However, MSVC will now generate a different (but equally spurious) warning:

    warning C4996: 'strdup': The POSIX name for this item is deprecated. Instead, use the ISO C and C++ conformant name: _strdup. See online help for details.

    As it happens, the strdup function (without the leading underscore) is adopted as part of the ISO Standard since C23 (6/2019).


    1 On the reasons for the casts when using the tolower function, see: Do I need to cast to unsigned char before calling toupper(), tolower(), et al.?. However, simply adding those casts does not silence the two code analysis warnings.