Tags: arrays, c, language-lawyer, fgets

Why does fgets() accept (signed) int for its 'count' argument?


The standard function fgets is specified this way in the upcoming C23 Standard:

7.23.7.2 The fgets function

Synopsis

    #include <stdio.h>
    char *fgets(char * restrict s, int n, FILE * restrict stream);

The fgets function reads at most one less than the number of characters specified by n from the stream pointed to by stream into the array pointed to by s. No additional characters are read after a new-line character (which is retained) or after end-of-file. A null character is written immediately after the last character read into the array.

Returns

The fgets function returns s if successful. If end-of-file is encountered and no characters have been read into the array, the contents of the array remain unchanged and a null pointer is returned. If a read error occurs during the operation, the members of the array have unspecified values and a null pointer is returned.

This specification is unchanged since the original ANSI-C document, except for the restrict keywords added in C99.

The type specified for n is int, which is inconsistent with the other C library functions that take an array length or an object count, where the type is size_t. This cannot be fixed without potentially breaking programs that use function pointers to fgets or that pass negative values for n. fread and fwrite, along with many other standard functions, had their numeric arguments and return values specified as int in the original Unix documentation, but these were changed before ANSI C. This raises the question: why was fgets left out of that update? Was there a compelling reason to treat fgets() differently?
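To make the mismatch concrete, here is a minimal sketch of how calling code might bridge a size_t buffer length to the int parameter. The name fgets_sz is hypothetical and not part of any standard library; clamping to INT_MAX is just one possible policy, assumed here for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not a standard function): accepts the buffer
 * size as size_t and clamps it to INT_MAX before forwarding to fgets,
 * so a large size_t value cannot wrap to a negative int. */
char *fgets_sz(char *s, size_t n, FILE *stream)
{
    assert(s != NULL && stream != NULL);
    int count = n > (size_t)INT_MAX ? INT_MAX : (int)n;
    return fgets(s, count, stream);
}
```

A caller would then write fgets_sz(buf, sizeof buf, stdin) without a cast. A buffer larger than INT_MAX is simply filled in several calls, as with plain fgets.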

Furthermore, some cases do not seem to have specified behavior:

  • if n <= 1, should fgets() return s without trying to read from the stream?
  • if n == 1, should s[0] be set to a null character even though no characters have been read into the array?
  • if n <= 0, can we pass a null pointer for s?

Solution

  • Regarding the behavior for n <= 1:

    There is no formal definition of "successful" in the context of this specification, and the only cases of failure that come to mind relate to reading from the stream and either getting an error or reaching the end of file. Both cases are addressed in the specification.

    Since fgets() reads at most one less than the number of characters specified by n, it should not read any characters for n <= 1, and hence should not attempt any input from the stream. Therefore, fgets() should return s in this case.

    It seems unspecified whether s must be modified and s[0] set to a null character if n == 1, although all C libraries for which I have access to the source code do behave this way.

    Because of this potential problem, it seems advisable to avoid calling fgets() with a value below 2 for argument n.
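One way to follow that advice without relying on library-specific behavior is to handle small values of n before fgets is ever called. The sketch below uses a hypothetical wrapper, fgets_checked. Its choices (a null pointer for n <= 0 or a null s, an empty string for n == 1) are assumptions made for this example, not something the standard mandates.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical wrapper (not part of the C library): pins down the
 * n <= 1 cases that the standard text leaves open. */
char *fgets_checked(char *s, int n, FILE *stream)
{
    assert(stream != NULL);
    if (s == NULL || n <= 0)
        return NULL;          /* no room even for the terminator */
    if (n == 1) {
        s[0] = '\0';          /* empty string; the stream is not touched */
        return s;
    }
    return fgets(s, n, stream);   /* n >= 2: behavior is well specified */
}
```

With this wrapper, fgets itself is only ever called with n >= 2, so the result no longer depends on how a given library treats the edge cases.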

    Note that snprintf has a similar problem:

    7.23.6.5 The snprintf function

    Synopsis

        #include <stdio.h>
        int snprintf(char * restrict s, size_t n, const char * restrict format, ...);
    

    The snprintf function is equivalent to fprintf, except that the output is written into an array (specified by argument s) rather than to a stream. If n is zero, nothing is written, and s may be a null pointer. Otherwise, output characters beyond the n-1st are discarded rather than being written to the array, and a null character is written at the end of the characters actually written into the array. If copying takes place between objects that overlap, the behavior is undefined.

    Returns

    The snprintf function returns the number of characters that would have been written had n been sufficiently large, not counting the terminating null character, or a negative value if an encoding error occurred. Thus, the null-terminated output has been completely written if and only if the returned value is both nonnegative and less than n.

    The sentence "and a null character is written at the end of the characters actually written into the array" is somewhat ambiguous if no characters have actually been written to the array, either because the output is empty or because it has been completely discarded.

    Despite this ambiguity, it seems hard to argue that snprintf, given an empty format string and a nonzero n, could fail to store an empty string in s.
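The expected behavior can be checked on a given implementation with a small probe. The function name probe_snprintf_n1 is hypothetical, and the probe uses "%s" with a caller-supplied string rather than a literally empty format, only to avoid compiler warnings about zero-length formats. It reports what the local library does; it does not settle what the standard requires.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical probe: calls snprintf with n == 1 and reports whether
 * s[0] was set to a null character and whether the return value is the
 * length the full output would have had. Returns 1 if both hold. */
int probe_snprintf_n1(const char *text)
{
    assert(text != NULL);
    char buf[4] = { 'x', 'x', 'x', 'x' };   /* sentinel contents */
    int r = snprintf(buf, 1, "%s", text);
    return r >= 0 && (size_t)r == strlen(text) && buf[0] == '\0';
}
```

Calling probe_snprintf_n1("") exercises the empty-output case, and probe_snprintf_n1("hello") exercises the case where all output is discarded.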

    It would be helpful to have a more precise specification of both snprintf and fgets for the case n == 1, ensuring that s[0] is set to a null character.