
How to properly fread & fwrite from & to a pipe


I have this code which acts as a pipe between two shell invocations.

It reads from a pipe, and writes into a different one.

#include <stdio.h>
#include <stdlib.h>


#define BUFF_SIZE (0xFFF)

/*
 *  $ cat /tmp/redirect.txt |less
 */
int main(void)
{
    FILE    *input;
    FILE    *output;
    int     c;
    char    buff[BUFF_SIZE];
    size_t  nmemb;

    input   = popen("cat /tmp/redirect.txt", "r");
    output  = popen("less", "w");
    if (!input || !output)
        exit(EXIT_FAILURE);

#if 01
    while ((c = fgetc(input))  !=  EOF)
        fputc(c, output);
#elif 01
    do {
        nmemb   = fread(buff, 1, sizeof(buff), input);
        fwrite(buff, 1, nmemb, output);
    } while (nmemb);
#elif 01
    while (feof(input) != EOF) {
        nmemb   = fread(buff, 1, sizeof(buff), input);
        fwrite(buff, 1, nmemb, output);
    }
/*
 * EDIT: The previous implementation is incorrect:
 * feof() returns non-zero if the EOF indicator is set; it never returns EOF.
 * EDIT2: Forgot the ! in the loop below. Adding it solved the problem.
 */
#elif 01
    while (feof(input)) {
        nmemb   = fread(buff, 1, sizeof(buff), input);
        fwrite(buff, 1, nmemb, output);
    }
#endif

    pclose(input);
    pclose(output);

    return  0;
}

I want it to be efficient, so I want to implement it with fread() & fwrite(). These are the 3 ways I tried.

The first one is implemented with fgetc()&fputc() so it will be very slow. However it works fine because it checks for EOF so it will wait until cat (or any shell invocation I use) finishes its job.

The second one is faster, but I'm concerned that I don't check for EOF so if there is any moment when the pipe is empty (but the shell invocation hasn't finished, so may not be empty in the future), it will close the pipe and end.

The third implementation is what I would like to do, and it mostly works (all the text is received by less), but for some reason it gets stuck and doesn't close the pipe (it seems it never gets the EOF).

EDIT: Third implementation is buggy. Fourth tries to solve the bug, but now less doesn't receive anything.

How should this be properly done?


Solution

  • First of all, I think your problem has more to do with buffering than with efficiency. That is a common problem when first dealing with the stdio package.

    Second, the best (and simplest) implementation of a simple data copier from input to output is the following snippet (copied from K&R first ed.).

    while((c = fgetc(input)) != EOF) 
        fputc(c, output);
    

    (Well, not a literal copy: K&R use stdin and stdout as the FILE * streams, and they use the simpler getchar() and putchar(c) calls.) When you try to do better than this, you normally end up relying on false assumptions, such as the supposed lack of buffering or the number of system calls involved.

    stdio does full buffering when the standard output is a pipe (indeed, it always does full buffering unless isatty(3) returns true for the file descriptor). So if you want to see the output as soon as it is available, you should at least disable output buffering (with something like setbuf(output, NULL);) or call fflush() on your output at some point, so that data doesn't sit in the output buffer while you are waiting on the input for more.

    What you seem to be seeing is that the output for the less(1) program is not visible because it is being held in the internal buffers of your program. And that is exactly what is happening. Despite handling individual characters, your program is doing full buffering: stdio reads the input in chunks of up to BUFSIZ characters, and the loop then makes one fgetc() call and one fputc() call per character, which fills the output buffer. But this buffer is not written until it is full and one more char forces a flush. So, until more than BUFSIZ characters have been queued (or the stream is flushed or closed), nothing is written to less(1).

    A simple and efficient way is to check, after fputc(c, output);, whether the char is a \n and, if so, flush the output with fflush(output);, so that you write one line of output at a time.

    fputc(c, output);
    if (c == '\n') fflush(output);
    

    If you don't do something like this, the output is written in BUFSIZ chunks, and normally not before that much data has accumulated on the output side. And always remember to close your streams (with pclose() here, since they come from popen()). stdio flushes open streams on a normal exit, but you can lose output if your process gets interrupted.

    IMHO the code you should use is:

    while ((c = fgetc(input))  !=  EOF) {
        fputc(c, output);
        if (c == '\n') fflush(output);
    }
    pclose(input);  /* streams from popen() must be closed with pclose() */
    pclose(output);
    

    for the best performance, without holding output data in the buffer unnecessarily.
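Streams obtained from popen() are closed with pclose(), which also waits for the command and returns its wait status. A small sketch of checking that status follows; close_command is a hypothetical helper name, and collapsing every failure into -1 is a simplification chosen for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/wait.h>

/* Hypothetical helper: close a popen() stream and report how the command ended.
 * Returns the command's exit code, or -1 if pclose() failed or the
 * command did not exit normally (for example, it was killed by a signal). */
static int close_command(FILE *stream)
{
    int status = pclose(stream);

    if (status == -1)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;
}
```

Closing the output stream this way also makes the program wait until less(1) has exited.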

    BTW, doing fread() and fwrite() of one char at a time is a waste of time, and it makes things a lot more complicated and error prone. An fwrite() of one char will not bypass the buffers, so you won't get more performance than with fputc(c, output);.
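For comparison, here is a sketch of a block-sized fread()/fwrite() copy, which is close to the second loop in the question. fread() blocks until it has filled the buffer, reached end of file, or hit an error, so a momentarily empty pipe does not end the loop early. copy_stream and the 4096-byte buffer are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: copy everything from in to out in blocks.
 * Returns 0 on success, -1 on a read or write error. */
static int copy_stream(FILE *in, FILE *out)
{
    char buf[4096];
    size_t n;

    /* fread() returns a short count only at end of file or on error,
     * so a return of 0 ends the loop in both cases. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;              /* write error */
    }
    if (ferror(in))
        return -1;                  /* read error rather than end of file */
    return fflush(out) == 0 ? 0 : -1;
}
```

The trade-off is latency: the reader on the other side sees data only one block at a time, not line by line.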

    BTW(bis) if you want to do your own buffering, don't call stdio functions, just use read(2) and write(2) calls on normal system file descriptors. A good approach is:

    int input_fd = fileno(input); /* input is your old FILE * given by popen() */
    int output_fd = fileno(output);
    ssize_t n; /* read(2) result: > 0 bytes read, 0 on EOF, -1 on error */
    
    while ((n = read(input_fd, your_buffer, sizeof your_buffer)) > 0) {
        write(output_fd, your_buffer, n);
    }
    switch (n) {
    case 0: /* we got EOF */
        ...
        break;
    default: /* we got an error */
        fprintf(stderr, "error: read(): %s\n", strerror(errno));
        ...
        break;
    } /* switch */
    

    Note that read(2) on a pipe returns as soon as some data is available (or 0 once the writer has closed its end), so the chunks you forward can be anything from one byte up to the size of your buffer.
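The read(2)/write(2) loop above ignores the return value of write(2), which may write fewer bytes than requested. The sketch below shows one way to handle that; write_all and copy_fd are hypothetical helper names, and retrying on EINTR is a design choice of this example.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes, retrying after short writes
 * and after interruption by a signal. Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t w = write(fd, buf, len);

        if (w < 0) {
            if (errno == EINTR)
                continue;           /* interrupted before writing: retry */
            return -1;
        }
        buf += w;                   /* skip the part already written */
        len -= (size_t)w;
    }
    return 0;
}

/* Hypothetical helper: copy in_fd to out_fd until end of file.
 * Returns 0 on success, -1 on a read or write error. */
static int copy_fd(int in_fd, int out_fd)
{
    char buf[4096];

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);

        if (n == 0)
            return 0;               /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (write_all(out_fd, buf, (size_t)n) < 0)
            return -1;
    }
}
```

With the variables from the snippet above, the call would be copy_fd(input_fd, output_fd);.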

    If you want to feed your data to less(1) as soon as you have a full line for it, you can completely disable the input buffer with:

    setbuf(input, NULL);
    int c; /* int, never char, see manual page */
    while ((c = fgetc(input)) != EOF) {
        putc(c, output);
        if (c == '\n') fflush(output);
    }
    

    And you'll get less(1) working as soon as you have produced a single line of output text.

    What exactly are you trying to do? (This would be nice to know, as you seem to be reinventing the cat(1) program, but with reduced functionality.)