c unix pipe file-descriptor

C: Cannot read from 2 pipe file descriptors without losing the other one


I have C code that simulates the shell command ls | wc. I want to read the output of each command so I can print both: the output of ls and the output of wc when ls is piped into it. The issue I'm facing is that whenever I read the output of one command, I somehow lose the other.

Observe the following code

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define LS_PATH "/bin/ls"
#define WC_PATH "/usr/bin/wc"

int main()
{
    pid_t pid;

    int link[2], link2[2];

    char *const arg1[] = {"ls", NULL};
    char *const arg2[] = {"wc", NULL};

    char buffer1[4096] = {0}, buffer2[4096] = {0};

    pipe(link);
    pipe(link2);

    pid = fork();

    if (pid == 0)
    {
        dup2(link[1], STDOUT_FILENO);
        close(link[0]);
        close(link[1]);
        execv(LS_PATH, arg1);
        perror("error1");
    }
    else
    {

        pid = fork();

        if (pid == 0)
        {
            dup2(link[0], STDIN_FILENO);
            dup2(link2[1], STDOUT_FILENO);
            close(link[1]);
            close(link[0]);
            close(link2[1]);
            close(link2[0]);
            execv(WC_PATH, arg2);
            perror("error2");
        }
        else
        {

            close(link[1]);

            close(link2[1]);

            // the following two lines of code are the point of interest
            read(link[0], buffer1, sizeof(buffer1));   // ls
            read(link2[0], buffer2, sizeof(buffer2));  // wc

            printf("%s\n", buffer1);

            printf("%s\n", buffer2);
        }
    }
}

Focus mainly on the following code statements:

read(link[0], buffer1, sizeof(buffer1));   // ls
read(link2[0], buffer2, sizeof(buffer2));  // wc

The output of ls is read into buffer1 first and prints fine, but buffer2, which should hold the output of wc, just shows 0.

If I were to switch the ordering of the above code statements such that I would have the following:

read(link2[0], buffer2, sizeof(buffer2));  // wc
read(link[0], buffer1, sizeof(buffer1));   // ls

Then buffer2 holds the output of wc, just as if I had run ls | wc in the terminal, but buffer1 (ls) prints nothing.

I can't get both, only one or the other.

How do I get both?


Solution

        if (pid == 0)
        {
            dup2(link[0], STDIN_FILENO); // ** HERE **
            dup2(link2[1], STDOUT_FILENO);
            close(link[1]);
            close(link[0]);
            close(link2[1]);
            close(link2[0]);
            execv(WC_PATH, arg2);
            perror("error2");
        }
    

    Notice that wc reads from link[0].

            // the following two lines of code are the point of interest
            read(link[0], buffer1, sizeof(buffer1));   // ls   ** HERE **
            read(link2[0], buffer2, sizeof(buffer2));  // wc
    

    Notice that the parent process also tries to read from link[0].

    You can't get both because data written to a pipe can be read from it only once. wc has to read ls's output in order to count its lines, words, and bytes, so if wc consumes it, the parent can't read it as well. Conversely, if the parent reads ls's output out of the pipe first, wc finds nothing left and only sees end-of-file, which matches the 0 you get from wc. That's why you can only get one or the other -- whatever one reader takes out of the pipe is no longer there for the other to read.
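
    The read-once behaviour can be seen in isolation with a small sketch. The helper name pipe_read_twice is made up for this illustration; it writes a short message into a fresh pipe, closes the write end, and reads twice. Under these assumptions the first read should return the message length and the second should return 0 (end-of-file), because the data has already been consumed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of the question's code): write msg into a
 * new pipe, close the write end, then read from the pipe twice.
 * The byte counts of the two reads are stored in *first and *second.
 * Returns 0 on success, -1 if the pipe could not be set up. */
int pipe_read_twice(const char *msg, ssize_t *first, ssize_t *second)
{
    int fd[2];
    char buf[256];
    size_t len = strlen(msg);

    /* Keep the message small so the write cannot block on a full pipe. */
    assert(len > 0 && len <= sizeof(buf));

    if (pipe(fd) == -1)
        return -1;

    if (write(fd[1], msg, len) != (ssize_t)len) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]); /* no writers left: an empty pipe now reads as EOF */

    *first = read(fd[0], buf, sizeof(buf));  /* takes the data out of the pipe */
    *second = read(fd[0], buf, sizeof(buf)); /* nothing left to read */

    close(fd[0]);
    return 0;
}
```

    Calling pipe_read_twice("hello", &a, &b) and printing a and b shows the same effect as in the question: whichever read comes first gets the data, and the next one gets nothing.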

    There are a lot of ways you can fix this, depending on your requirements.

    1. You can use three pipes, with the extra pipe serving as wc's standard input. The parent process would have to copy the data from the pipe that holds ls's output into the pipe that feeds wc's input.

    2. You can add a third process, an instance of tee that duplicates the data.

    3. You can have ls write to a temporary file. Both wc and your parent process can read that file. Files allow the same data to be read more than once, unlike pipes.
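
    As a rough sketch of option 1, the parent can sit between the two commands: it reads ls's output, keeps a copy, and forwards the same bytes to wc through a third pipe. The helper names spawn and run_ls_wc, the pipe names, and the use of execlp (which searches PATH, instead of the hard-coded paths in the question) are assumptions made for this illustration, and error handling is kept minimal.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child, point its stdin/stdout at in_fd/out_fd
 * (-1 leaves the stream alone), close every pipe end, and exec prog. */
static pid_t spawn(const char *prog, int in_fd, int out_fd,
                   const int *fds, int nfds)
{
    pid_t pid = fork();
    if (pid != 0)
        return pid;                      /* parent, or -1 if fork failed */
    if (in_fd != -1)
        dup2(in_fd, STDIN_FILENO);
    if (out_fd != -1)
        dup2(out_fd, STDOUT_FILENO);
    for (int i = 0; i < nfds; i++)
        close(fds[i]);                   /* the dup2 copies stay open */
    execlp(prog, prog, (char *)NULL);
    perror(prog);                        /* only reached if exec failed */
    _exit(127);
}

/* Hypothetical helper: run ls, keep a NUL-terminated copy of its output in
 * ls_buf, forward the same bytes to wc, and store wc's output in wc_buf.
 * Output that does not fit in ls_buf is still forwarded, just not kept. */
int run_ls_wc(char *ls_buf, size_t ls_cap, char *wc_buf, size_t wc_cap)
{
    int ls_out[2], wc_in[2], wc_out[2];
    char chunk[4096];
    size_t ls_len = 0, wc_len = 0;
    ssize_t n;

    assert(ls_cap > 0 && wc_cap > 0);
    if (pipe(ls_out) == -1 || pipe(wc_in) == -1 || pipe(wc_out) == -1)
        return -1;
    const int fds[6] = {ls_out[0], ls_out[1], wc_in[0],
                        wc_in[1],  wc_out[0], wc_out[1]};

    pid_t ls_pid = spawn("ls", -1, ls_out[1], fds, 6);
    pid_t wc_pid = spawn("wc", wc_in[0], wc_out[1], fds, 6);
    if (ls_pid == -1 || wc_pid == -1)
        return -1;

    /* The parent keeps only the read end of ls_out, the write end of
     * wc_in, and the read end of wc_out. */
    close(ls_out[1]);
    close(wc_in[0]);
    close(wc_out[1]);

    while ((n = read(ls_out[0], chunk, sizeof(chunk))) > 0) {
        size_t room = ls_cap - 1 - ls_len;
        size_t keep = (size_t)n < room ? (size_t)n : room;
        memcpy(ls_buf + ls_len, chunk, keep);   /* the parent's own copy */
        ls_len += keep;
        /* Forward the chunk to wc; write() may accept only part of it. */
        for (ssize_t off = 0, w; off < n; off += w) {
            w = write(wc_in[1], chunk + off, (size_t)(n - off));
            if (w <= 0)
                break;                          /* give up on this chunk */
        }
    }
    ls_buf[ls_len] = '\0';
    close(ls_out[0]);
    close(wc_in[1]);                            /* wc now sees end-of-file */

    while (wc_len + 1 < wc_cap &&
           (n = read(wc_out[0], wc_buf + wc_len, wc_cap - 1 - wc_len)) > 0)
        wc_len += (size_t)n;
    wc_buf[wc_len] = '\0';
    close(wc_out[0]);

    waitpid(ls_pid, NULL, 0);
    waitpid(wc_pid, NULL, 0);
    return 0;
}
```

    A main function could declare two char arrays, call run_ls_wc(buffer1, sizeof(buffer1), buffer2, sizeof(buffer2)), and then print both buffers with printf("%s\n", ...). Closing the write end of wc's input pipe in the parent matters here: wc prints its counts only after it sees end-of-file, so leaving that descriptor open would make the final read wait forever.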