I have C code that simulates the shell pipeline ls | wc. I want to read the output of each command, both ls and wc, so that I can print them while ls is piped into wc. The problem is that whenever I read the output of one command, I lose the output of the other.
Consider the following code:
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define LS_PATH "/bin/ls"
#define WC_PATH "/usr/bin/wc"

int main()
{
    pid_t pid;
    int link[2], link2[2];
    char *const arg1[] = {"ls", NULL};
    char *const arg2[] = {"wc", NULL};
    // plain char arrays (not arrays of pointers), zeroed so "%s" sees a terminator
    char buffer1[4096] = {0}, buffer2[4096] = {0};

    pipe(link);   // ls writes into this pipe
    pipe(link2);  // wc writes into this pipe

    pid = fork();
    if (pid == 0)
    {
        // first child: ls, with stdout redirected into link
        dup2(link[1], STDOUT_FILENO);
        close(link[0]);
        close(link[1]);
        execv(LS_PATH, arg1);
        perror("error1");
        return 1;
    }
    else
    {
        pid = fork();
        if (pid == 0)
        {
            // second child: wc, reading from link and writing into link2
            dup2(link[0], STDIN_FILENO);
            dup2(link2[1], STDOUT_FILENO);
            close(link[1]);
            close(link[0]);
            close(link2[1]);
            close(link2[0]);
            execv(WC_PATH, arg2);
            perror("error2");
            return 1;
        }
        else
        {
            close(link[1]);
            close(link2[1]);
            // the following two lines of code is the point of interest
            read(link[0], buffer1, sizeof(buffer1)); // ls
            read(link2[0], buffer2, sizeof(buffer2)); // wc
            printf("%s\n", buffer1);
            printf("%s\n", buffer2);
        }
    }
}
Focus mainly on the following code statements:
read(link[0], buffer1, sizeof(buffer1)); // ls
read(link2[0], buffer2, sizeof(buffer2)); // wc
The output of ls is read first into buffer1 and prints fine, but the read into buffer2, which should receive the output of wc, just returns 0.
If I swap the order of those two statements, so that I have the following:
read(link2[0], buffer2, sizeof(buffer2)); // wc
read(link[0], buffer1, sizeof(buffer1)); // ls
Then the output of wc is read into buffer2 correctly, as if I had run ls | wc in the terminal, but the ls output in buffer1 does not print.
I can't get both, only one or the other.
How do I get both?
if (pid == 0)
{
    dup2(link[0], STDIN_FILENO); // ** HERE **
    dup2(link2[1], STDOUT_FILENO);
    close(link[1]);
    close(link[0]);
    close(link2[1]);
    close(link2[0]);
    execv(WC_PATH, arg2);
    perror("error2");
}
Notice that wc reads from link[0].
// the following two lines of code is the point of interest
read(link[0], buffer1, sizeof(buffer1)); // ls ** HERE **
read(link2[0], buffer2, sizeof(buffer2)); // wc
Notice that the parent process also tries to read from link[0].
You can't get both because data written to a pipe can be read from it only once. If wc reads ls's output to count its characters, the parent can't also read that output. If the parent reads ls's output out of the pipe, wc can't get it because it is no longer in the pipe. That's why you get only one or the other: once one reader has consumed the data, it is gone for every other reader of that pipe.
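The read-once behaviour is easy to see in isolation. The following is a minimal sketch, not part of the original program: the helper name read_pipe_twice is made up for illustration. It writes five bytes into a fresh pipe, closes the write end, and reads twice from the read end.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: puts 5 bytes into a new pipe and reads it twice.
 * Stores the two read() return values in *first and *second.
 * Returns 0 on success, -1 if the pipe could not be set up. */
int read_pipe_twice(ssize_t *first, ssize_t *second)
{
    int fd[2];
    char buf[16];

    if (pipe(fd) == -1)
        return -1;
    if (write(fd[1], "hello", 5) != 5) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]); /* no writers left, so an empty pipe reports end-of-file */

    *first = read(fd[0], buf, sizeof buf);  /* consumes all 5 bytes */
    *second = read(fd[0], buf, sizeof buf); /* nothing left: returns 0 */
    close(fd[0]);
    return 0;
}
```

The second read returns 0 for the same reason the parent's read of link2[0] or link[0] comes back empty in the question: the bytes were already taken out of the pipe by an earlier read.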
There are a lot of ways you can fix this, depending on your requirements.
You can use three pipes, with the extra pipe feeding wc's standard input. The parent process would then have to copy the data from the pipe that carries ls's output into the pipe that is wc's input.
You can add a third process, an instance of tee, that duplicates the data.
You can have ls write to a temporary file. Both wc and your parent process can then read that file. Unlike pipes, files allow the same data to be read more than once.
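The first option, three pipes with the parent relaying the data, might look roughly like the sketch below. It is only one way to arrange it, under some assumptions: the names run_ls_wc, spawn, drain and close_pipes are invented here, execlp with a PATH lookup stands in for the execv calls with hard-coded paths, the output buffers must have a capacity of at least 1, and error handling is kept to a minimum.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Close both ends of all three pipes. */
static void close_pipes(int p[3][2])
{
    for (int i = 0; i < 3; i++) {
        close(p[i][0]);
        close(p[i][1]);
    }
}

/* Fork a child that runs `name` with the given stdin/stdout descriptors. */
static pid_t spawn(const char *name, int in_fd, int out_fd, int p[3][2])
{
    pid_t pid = fork();
    if (pid == 0) {
        if (in_fd != STDIN_FILENO)
            dup2(in_fd, STDIN_FILENO);
        if (out_fd != STDOUT_FILENO)
            dup2(out_fd, STDOUT_FILENO);
        close_pipes(p); /* the child keeps only its dup'ed copies */
        execlp(name, name, (char *)NULL);
        perror(name);
        _exit(127);
    }
    return pid;
}

/* Read fd until EOF into buf (NUL-terminated, truncated to cap - 1 bytes).
 * If fwd_fd >= 0, every chunk is also written to fwd_fd. */
static void drain(int fd, char *buf, size_t cap, int fwd_fd)
{
    char chunk[4096];
    size_t used = 0;
    ssize_t n;

    while ((n = read(fd, chunk, sizeof chunk)) > 0) {
        size_t keep = (size_t)n;
        if (keep > cap - 1 - used)
            keep = cap - 1 - used;
        memcpy(buf + used, chunk, keep);
        used += keep;

        /* Forward the whole chunk, coping with short writes. */
        for (ssize_t off = 0; fwd_fd >= 0 && off < n; ) {
            ssize_t w = write(fwd_fd, chunk + off, (size_t)(n - off));
            if (w <= 0)
                break;
            off += w;
        }
    }
    buf[used] = '\0';
}

/* Run ls and wc, capturing the output of each. Returns 0 on success. */
int run_ls_wc(char *ls_out, size_t ls_cap, char *wc_out, size_t wc_cap)
{
    int p[3][2]; /* p[0]: ls -> parent, p[1]: parent -> wc, p[2]: wc -> parent */

    for (int i = 0; i < 3; i++) {
        if (pipe(p[i]) == -1) {
            while (i-- > 0) {
                close(p[i][0]);
                close(p[i][1]);
            }
            return -1;
        }
    }

    pid_t ls_pid = spawn("ls", STDIN_FILENO, p[0][1], p);
    pid_t wc_pid = spawn("wc", p[1][0], p[2][1], p);

    /* The parent keeps p[0][0] (read ls), p[1][1] (feed wc), p[2][0] (read wc). */
    close(p[0][1]);
    close(p[1][0]);
    close(p[2][1]);

    drain(p[0][0], ls_out, ls_cap, p[1][1]); /* save ls output and relay it */
    close(p[0][0]);
    close(p[1][1]); /* wc sees end-of-file on its stdin */

    drain(p[2][0], wc_out, wc_cap, -1);
    close(p[2][0]);

    if (ls_pid > 0)
        waitpid(ls_pid, NULL, 0);
    if (wc_pid > 0)
        waitpid(wc_pid, NULL, 0);
    return (ls_pid > 0 && wc_pid > 0) ? 0 : -1;
}
```

A caller would pass two char arrays, for example run_ls_wc(buffer1, sizeof buffer1, buffer2, sizeof buffer2), and then print both. Closing the parent's write end of the wc input pipe matters: without it wc never sees end-of-file and never prints its counts.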