
getchar() with EOF not behaving as expected


Working my way through K&R, I stumbled upon this unexpected behaviour. Consider the following code:

#include <stdio.h>

#define MAXWLEN         10  /* maximum word length      */
#define MAXHISTWIDTH    10  /* maximum histogram width  */
#define IN              1   /* inside a word            */
#define OUT             0   /* outside a word           */

int main()
{
    int c, i, state;
    int wlen[MAXWLEN];

    for (i = 0; i < MAXWLEN; ++i)
        wlen[i] = 0;

    i = 0;                  /* length of current word   */
    state = OUT;            /* start outside of words   */
    while ((c = getchar()) != EOF)
    {
        if (c == ' ' || c == '\t' || c == '\n')
        {
            state = OUT;
            if (i > 0 && i < MAXWLEN)
                ++wlen[i];
            i = 0;
        }
        else if (state == OUT)  /* beginning of word */
        {
            state = IN;
            i = 1;
        }
        else                    /* in word */
            ++i;
    }
    if (i > 0 && i < MAXWLEN)   /* count a final word not followed by whitespace */
        ++wlen[i];

    printf("\nwordlen\toccurrences\n");
    for (i = 1; i < MAXWLEN; ++i)
    {
        printf("%6d:\t", i);
        if (wlen[i] > MAXHISTWIDTH)
            wlen[i] = MAXHISTWIDTH;
        for (int j = 0; j < wlen[i]; ++j)
            printf("#");
        printf("\n");
    }
}

This counts the lengths of all words in a given input and prints a histogram of the result. The result is as expected.

But I have to press Ctrl-D twice if the last character I entered was not a newline (Enter). I'm running my program in zsh and compiled the file with cc.

Can somebody explain why this happens, or is it just an error that occurs on my machine?


Solution

  • This is not the behaviour of your program but of the terminal.

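    By default the terminal buffers input line by line and hands it to the program only when a line is complete. Ctrl-D does not send a character to the program; it tells the terminal to deliver whatever has been typed so far immediately. Pressed in the middle of a line, it passes the pending characters to the program, so read() returns them and getchar() simply keeps going. Pressed a second time (or at the start of a line), there is nothing pending, so read() returns 0 bytes, and getchar() reports that as EOF.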