
Reading everything currently entered in stdin


I want to read everything that is on stdin after 10 seconds and then break. The code I've been able to write so far is:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>   /* sleep() */

int main() {
  sleep(10);
  int c; /* getchar() returns int so that EOF can be told apart from a character */
  while (1) { // My goal is to modify this while statement to break after it has read everything.
    c = getchar();
    putchar(c);
  }
  printf("Everything has been read from stdin");
}

So when the letter "c" is entered before the 10 seconds have elapsed, it should print "c" (after sleep is done) and then "Everything has been read from stdin".

So far I have tried:

  • Checking if c is EOF -> getchar and similar functions never return EOF for stdin
  • Using a stat-type function on stdin -> stat-ing stdin always returns 0 for size (st_size).

Solution

  • Here's an offering that meets my interpretation of your requirements:

    • The program reads whatever data is typed (or otherwise entered) on standard input in a period of 10 seconds (stopping if you manage to enter 2047 characters — which would probably mean that the input is coming from a file or a pipe).
    • After 10 seconds, it prints whatever it has collected.
    • The alarm() call sets an alarm for an integral number of seconds hence, and the system generates a SIGALRM signal when the time is up. The alarm signal interrupts the read() system call, even if no data has been read.
    • If the program receives one of SIGINT, SIGQUIT, SIGHUP, SIGPIPE, or SIGTERM, it restores the terminal settings and exits without printing anything.
    • It changes the terminal settings so that the input is not line-buffered (non-canonical mode), so read() returns each character as it is typed instead of waiting for a newline. The program also ensures that system calls do not restart after a signal is received. That may not matter on Linux; using signal() on macOS Big Sur 11.7.1, the input continued after the alarm signal, which was not helpful. Using sigaction() gives you better control.
    • It does its best to ensure that the terminal mode is restored on exit, but if you send an inappropriate signal (not one of those in the list above, or SIGALRM), you will have a terminal in non-canonical (raw) mode. That leads to confusion, in general.
    • It is easy to modify the program so that:
      • input is not echoed by the terminal driver;
      • characters are echoed by the program as they arrive (but beware of editing characters);
      • signals are not generated by the keyboard;
      • the program leaves the standard input terminal attributes alone if standard input is not a terminal.
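As a rough illustration of those modifications, here is a minimal sketch of a variant of stty_raw(). The names make_input_mode() and set_input_mode() are hypothetical and do not appear in the answer's code; the sketch assumes that clearing ECHO and ISIG in c_lflag is what is wanted for "no echo" and "no keyboard signals".

```c
#include <stdbool.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: derive the modified settings from the original ones.
 * It only manipulates flags, so it can be checked without a terminal. */
static struct termios make_input_mode(struct termios orig, bool echo, bool keyboard_signals)
{
    struct termios copy = orig;
    copy.c_lflag &= ~(tcflag_t)ICANON;      /* no line buffering, as in the answer */
    if (!echo)
        copy.c_lflag &= ~(tcflag_t)ECHO;    /* terminal driver does not echo input */
    if (!keyboard_signals)
        copy.c_lflag &= ~(tcflag_t)ISIG;    /* interrupt/quit keys arrive as plain bytes */
    return copy;
}

/* Hypothetical variant of stty_raw(): returns 0 on success, or -1 if fd is
 * not a terminal or a termios call fails; the terminal is untouched on -1. */
static int set_input_mode(int fd, struct termios *saved, bool echo, bool keyboard_signals)
{
    if (!isatty(fd))
        return -1;
    if (tcgetattr(fd, saved) != 0)
        return -1;
    struct termios copy = make_input_mode(*saved, echo, keyboard_signals);
    return tcsetattr(fd, TCSANOW, &copy);
}
```

With this shape, the caller would register the restoring function with atexit() only when set_input_mode() returns 0, so a pipe or file on standard input is left alone.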

    Code

    /* SO 7450-7966 */
    #include <ctype.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <termios.h>
    #include <unistd.h>
    
    #undef sigemptyset      /* macOS has a stupid macro that triggers -Wunused-value */
    
    static struct termios sane;
    
    static void stty_sane(void)
    {
        tcsetattr(STDIN_FILENO, TCSANOW, &sane);
    }
    
    static void stty_raw(void)
    {
        tcgetattr(STDIN_FILENO, &sane);
        struct termios copy = sane;
        copy.c_lflag &= ~ICANON;
        tcsetattr(STDIN_FILENO, TCSANOW, &copy);
    }
    
    static volatile sig_atomic_t alarm_recvd = 0;
    
    static void alarm_handler(int signum)
    {
        signal(signum, SIG_IGN);
        alarm_recvd = 1;
    }
    
    static void other_handler(int signum)
    {
        signal(signum, SIG_IGN);
        stty_sane();
        exit(128 + signum);
    }
    
    static int getch(void)
    {
        char c;
        if (read(STDIN_FILENO, &c, 1) == 1)
            return (unsigned char)c;
        return EOF;
    }
    
    static void set_handler(int signum, void (*handler)(int signum))
    {
        struct sigaction sa = { 0 };
        sa.sa_handler = handler;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;    /* No SA_RESTART! */
        if (sigaction(signum, &sa, NULL) != 0)
        {
            perror("sigaction");
            exit(EXIT_FAILURE);
        }
    }
    
    static void dump_string(const char *tag, const char *buffer)
    {
        printf("\n%s [", tag);
        int c;
        while ((c = (unsigned char)*buffer++) != '\0')
        {
            if (isprint(c) || isspace(c))
                putchar(c);
            else
                printf("\\x%.2X", c);
        }
        printf("]\n");
    }
    
    int main(void)
    {
        char buffer[2048];
    
        stty_raw();
        atexit(stty_sane);
        set_handler(SIGALRM, alarm_handler);
        set_handler(SIGHUP, other_handler);
        set_handler(SIGINT, other_handler);
        set_handler(SIGQUIT, other_handler);
        set_handler(SIGPIPE, other_handler);
        set_handler(SIGTERM, other_handler);
        alarm(10);
    
        size_t i = 0;
        int c;
        while (i < sizeof(buffer) - 1 && !alarm_recvd && (c = getch()) != EOF)
        {
            if (c == sane.c_cc[VEOF])
                break;
            if (c == sane.c_cc[VERASE])
            {
                if (i > 0)
                    i--;
            }
            else
                buffer[i++] = c;
        }
        buffer[i] = '\0';
    
        dump_string("Data", buffer);
        return 0;
    }
    

    Compilation:

    gcc -O3 -g -std=c11 -Wall -Wextra -Werror -Wmissing-prototypes -Wstrict-prototypes -fno-common tensec53.c -o tensec53 
    

    No errors (or warnings, but warnings are converted to errors).
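The erase and EOF handling in the reading loop of main() can also be looked at in isolation. The sketch below applies the same rules to bytes that have already been read; edit_input() is a hypothetical helper, not part of the program above, and the erase and EOF characters are passed in rather than taken from sane.c_cc.

```c
#include <stddef.h>

/* Hypothetical helper: apply the answer's erase/EOF rules to a byte array.
 * Stores a null-terminated result in out and returns its length. */
static size_t edit_input(const unsigned char *in, size_t inlen,
                         unsigned char erase, unsigned char eof,
                         char *out, size_t outsize)
{
    size_t i = 0;
    if (outsize == 0)
        return 0;
    for (size_t k = 0; k < inlen && i < outsize - 1; k++)
    {
        unsigned char c = in[k];
        if (c == eof)
            break;              /* EOF character ends the input */
        if (c == erase)
        {
            if (i > 0)
                i--;            /* drop the previous character, if any */
        }
        else
            out[i++] = (char)c;
    }
    out[i] = '\0';
    return i;
}
```

For example, with erase set to '\177' and EOF set to '\004', the input bytes 'a', 'b', '\177', 'c', '\004', 'd' leave "ac" in the output buffer.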

    Analysis

    • The #undef line removes any macro definition of sigemptyset() leaving the compiler calling an actual function. The C standard requires this to work (§7.1.4 ¶1). On macOS, the macro is #define sigemptyset(set) (*(set) = 0, 0) and GCC complains, not unreasonably, about the "right-hand operand of comma expression has no effect". The alternative way of fixing that warning is to test the return value from sigemptyset(), but that's arguably sillier than the macro. (Yes, I'm disgruntled about this!)
    • The sane variable records the value of the terminal attributes when the program starts; it is set by calling tcgetattr() in stty_raw(). The code ensures that sane is set before activating any code that will call stty_sane().
    • The stty_sane() function resets the terminal attributes to the sane state that was in effect when the program started. It is used by atexit() and also by the signal handlers.
    • The stty_raw() function gets the original terminal attributes, makes a copy of them, modifies the copy to turn off canonical processing (see Canonical vs non-canonical terminal input for more details), and sets the revised terminal attributes.
    • Standard C says you can't do much more in a signal handler function than set a volatile sig_atomic_t variable, call signal() with the signal number, or call one of a few exit functions (abort(), _Exit(), or quick_exit()). POSIX is a lot more gracious; see How to avoid using printf() in a signal handler? for more details.
    • There are two signal handlers, one for SIGALRM and one for the other signals that are trapped.
    • The alarm_handler() ignores further alarm signals and records that it was invoked.
    • The other_handler() ignores further signals of the same type, resets the terminal attributes to the sane state, and exits with a status used to report that a program was terminated by a signal (see POSIX shell Exit status for commands).
    • The getch() function reads a single character from standard input, mapping failures to EOF. The cast to unsigned char ensures that a successfully read character is returned as a non-negative value, as getchar() does, so it cannot be confused with EOF.
    • The set_handler() function uses sigaction() to set the signal handling. It ensures that the SA_RESTART bit is not set, so that when a signal interrupts a system call, the call returns with an error (errno set to EINTR) rather than restarting. Using signal() inside the signal handlers is a little lazy, but adequate.
    • The dump_string() function writes out a string with any non-printable characters other than space characters reported as a hex escape.
    • The main() function sets up the terminal, ensures that the terminal state is reset on exit (atexit() and the calls to set_handler() with the other_handler argument), and sets an alarm for 10 seconds hence.
    • The reading loop avoids buffer overflows and stops when the alarm is received or EOF (error) is detected.
    • Because canonical processing is turned off, there is no line editing. The body of the loop provides primitive line editing — it recognizes the erase (usually backspace '\b', sometimes delete '\177') character and the EOF character and handles them appropriately, otherwise adding the input to the buffer.
    • When the loop exits, usually because the alarm went off, it null terminates the string and then calls dump_string() to print what was entered.
    • If you wanted sub-second intervals, you would need to use the POSIX timer_create(), timer_delete(), timer_settime() (and maybe timer_gettime() and timer_getoverrun()) functions, which take struct timespec values for the time values. If they're not available, you might use the obsolescent setitimer() and getitimer() functions instead. The timer_create() step allows you to specify which signal will be sent when the timer expires — unlike alarm() and setitimer() which both send pre-determined signals.

    POSIX functions and headers: