Here's the description of gets()
from Prata's C Primer Plus:
It gets a string from your system's standard input device, normally your keyboard. Because a string has no predetermined length, gets() needs a way to know when to stop. Its method is to read characters until it reaches a newline (\n) character, which you generate by pressing the Enter key. It takes all the characters up to (but not including) the newline, tacks on a null character (\0), and gives the string to the calling program.
It got me curious as to what would happen when gets()
reads in just a newline. So I wrote this:
#include <stdio.h>

int main(void)
{
    char input[100];
    while(gets(input))
    {
        printf("This is the input as a string: %s\n", input);
        printf("Is it the string end character? %d\n", input == '\0');
        printf("Is it a newline string? %d\n", input == "\n");
        printf("Is it the empty string? %d\n", input == "");
    }
    return 0;
}
Here's my interaction with the program:
$ ./a.out
This is some string
This is the input as a string: This is some string
Is it the string end character? 0
Is it a newline string? 0
Is it the empty string? 0
This is the input as a string:
Is it the string end character? 0
Is it a newline string? 0
Is it the empty string? 0
The second block is really the thing of interest, when all I press is enter. What exactly is input in that case? It doesn't seem to be any of my guesses: \0, \n, or "".
This part in the description of gets
might be confusing:
It takes all the characters up to (but not including) the newline
It might be better to say that it reads all the characters including the newline, but stores only the characters that come before the newline.
So if the user enters some string
, the gets
function will read some string
and the newline character from the user's terminal, but store only some string
in the buffer - the newline character is lost. This is good, because no one wants the newline character anyway - it's a control character, not a part of the data that the user wanted to enter.
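To make the "reads the newline but doesn't store it" behaviour concrete, here is a minimal sketch of a gets-like reader. It is not the actual library implementation; read_line_sketch is a hypothetical name, and unlike gets it takes a buffer size and silently drops characters that don't fit.

```c
#include <stdio.h>

/* Hypothetical helper, NOT the real gets(): reads one line from `in`,
   consumes the newline without storing it, and terminates buf with '\0'.
   Unlike gets(), it never writes past the end of buf. */
char *read_line_sketch(FILE *in, char *buf, size_t size)
{
    size_t len = 0;
    int read_any = 0;   /* did we consume at least one character? */
    int c;

    if (size == 0)
        return NULL;

    while ((c = fgetc(in)) != EOF) {
        read_any = 1;
        if (c == '\n')
            break;                  /* newline is consumed, not stored */
        if (len + 1 < size)         /* leave room for the terminator */
            buf[len++] = (char)c;
    }

    if (!read_any)
        return NULL;                /* end of input, nothing was read */

    buf[len] = '\0';
    return buf;
}
```

With this sketch, a line consisting of only a newline leaves len at 0, so the buffer holds just the terminator: the empty string.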
Therefore, if you only press enter, gets
interprets it as an empty string. Now, as noted by some people, your code has multiple bugs.
printf("This is the input as a string: %s\n", input);
No problem here, though you might want to delimit your string by some artificial characters for better debugging:
printf("This is the input as a string: '%s'\n", input);
printf("Is it the string end character? %d\n", input == '\0');
Not good: you want to check 1 byte here, not the whole buffer. In this expression input is converted to a pointer to the buffer, and '\0' is an integer constant with value 0, which the compiler treats as a null pointer constant (NULL). So the comparison really asks "is the address of the buffer null?", and the answer is always false.
The right way is:
printf("Does the first byte contain the string end character? %d\n", input[0] == '\0');
This compares just 1 byte to \0
.
printf("Is it a newline string? %d\n", input == "\n");
Not good: this compares the address of the buffer with the address of the string literal "\n". These are two different objects, so the answer is always false, no matter what the buffer contains. The right way to compare strings in C is strcmp:
printf("Is it a newline string? %d\n", strcmp(input, "\n") == 0);
Note the peculiar usage: strcmp
returns 0 when the strings are equal.
printf("Is it the empty string? %d\n", input == "");
The same bug here. Use strcmp
here too:
printf("Is it the empty string? %d\n", strcmp(input, "") == 0);
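Putting the corrected checks together, a small sketch might look like the following. The helper names (is_empty_string, is_newline_string, report) are made up for this illustration; the checks themselves are the ones described above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helpers, introduced only for this illustration. */

/* True when the first byte is the terminator, i.e. the string is "". */
int is_empty_string(const char *s)
{
    return s[0] == '\0';
}

/* True when the string is exactly one newline character. */
int is_newline_string(const char *s)
{
    return strcmp(s, "\n") == 0;
}

/* Prints the same diagnostics as the original program, with the fixes. */
void report(const char *input)
{
    printf("This is the input as a string: '%s'\n", input);
    printf("Does the first byte contain the string end character? %d\n",
           input[0] == '\0');
    printf("Is it a newline string? %d\n", is_newline_string(input));
    printf("Is it the empty string? %d\n", is_empty_string(input));
}
```

Calling report(input) inside the loop should then show that pressing only enter produces the empty string: the first and last checks print 1, the newline check prints 0.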
BTW as people always say, gets cannot be used in a secure way: it has no way of knowing how big the buffer is, so it cannot protect against buffer overflow (it was removed from the standard in C11 for this reason). So you should use fgets instead, even though it's less convenient:
char input[100];
while (fgets(input, sizeof input, stdin))
{
...
}
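This leads to possible confusion: fgets doesn't delete the newline byte from the input it reads; it stores it in the buffer when the line fits. So if you replace gets in your code by fgets, you will get different results: pressing only enter gives the string "\n" instead of the empty string. Fortunately, your code (with the fixes above) will illustrate the difference in a clear way.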
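If you want the gets-like behaviour back (no newline in the buffer), one common idiom is to cut the string at the first newline with strcspn. The wrapper below is only a sketch; read_line_stripped is a hypothetical name, not a standard function.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper: reads one line with fgets() and removes the
   trailing newline if fgets() stored one. Returns NULL at end of input
   or on a read error, just like fgets(). */
char *read_line_stripped(FILE *in, char *buf, int size)
{
    if (fgets(buf, size, in) == NULL)
        return NULL;

    /* strcspn returns the index of the first '\n', or the string length
       if there is none, so this write is always inside the string. */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

The loop then becomes while (read_line_stripped(stdin, input, sizeof input)), and an empty line once again shows up as the empty string. Note that a line longer than the buffer is returned in pieces, with no newline in the earlier pieces.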