I have a C program built from three .c files. It prints a rectangle to standard output, sized by the x and y values I pass in main. The relevant code is below:
void rush(int x, int y);
int main(void)
{
rush(3, 3);
return (0);
}
Running the executable like so:
./a.out
gives the following:
o-o
| |
o-o
and changing the parameters passed to the rush function to (5, 5) yields the following:
o---o
|   |
|   |
|   |
o---o
You get the idea. Each line is terminated by a \n, so the next line prints in the right place. I also have a second test program, a minimal main that just prints the value of argc, because I wanted to see what piping the output of the first program into it would do. The second main is:
#include <stdio.h>
int main(int argc, char **argv)
{
printf("argc value is: %d\n", argc);
return (0);
}
Running the following commands:
./a.out | ./test
I get the following output:
argc value is: 1
Which didn't make sense to me initially, but then I remembered it was because some commands require xargs to accept input properly from stdin. Using xargs with (5, 5) as input in the main:
./a.out | xargs ./test
resulted in:
argc value is: 9
Thus I have two questions. Is there a way to do this without needing xargs and can be done in the c files themselves? And knowing the input to the test file, why is argc == 9? How does the program separate out a string in that format and decide what to put in the array?
This will be long, so grab your favourite drink. Don't just skip to the answers after the break.
First, examine the command-line arguments supplied to a program, say args.c:
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char *argv[])
{
int i;
printf("argc = %d\n", argc);
for (i = 0; i < argc; i++)
printf("argv[%d] = \"%s\"\n", i, argv[i]);
return EXIT_SUCCESS;
}
Compile that using your favourite C compiler; I use gcc:
gcc -Wall -O2 args.c -o args
If you run say
./args one two
it will output
argc = 3
argv[0] = "./args"
argv[1] = "one"
argv[2] = "two"
All Unixes have a command line utility or shell built-in printf that works much like the C printf() standard library function does. We can run for example
printf 'Hello, world!\nSecond line\nThird line\n'
and we'll see
Hello, world!
Second line
Third line
Now, if we connect the two with a pipe,
printf 'Hello, world!\nSecond line\nThird line\n' | ./args
we get
argc = 1
argv[0] = "./args"
because there were no parameters to ./args, and the above args.c ignores standard input completely.
The xargs utility reads its standard input, and then executes its own command-line arguments as a command, adding the input it read as additional parameters. It is highly configurable, too. If you run
printf 'Hello, world!\nSecond line\nThird line\n' | xargs ./args
you'll get
argc = 7
argv[0] = "./args"
argv[1] = "Hello,"
argv[2] = "world!"
argv[3] = "Second"
argv[4] = "line"
argv[5] = "Third"
argv[6] = "line"
because xargs turns each whitespace-separated token in the input into a command-line argument (by default it also interprets quotes and backslashes). If we tell xargs to turn each input line into a separate argument by using the -d SEPARATOR option (a GNU extension), with newline as the separator:
printf 'Hello, world!\nSecond line\nThird line\n' | xargs -d '\n' ./args
we get
argc = 4
argv[0] = "./args"
argv[1] = "Hello, world!"
argv[2] = "Second line"
argv[3] = "Third line"
If we tell xargs to add at most two arguments per command executed, by adding the -n 2 option,
printf 'Hello, world!\nSecond line\nThird line\n' | xargs -d '\n' -n 2 ./args
we'll get
argc = 3
argv[0] = "./args"
argv[1] = "Hello, world!"
argv[2] = "Second line"
argc = 2
argv[0] = "./args"
argv[1] = "Third line"
This output means that our ./args was actually executed twice. The first run was effectively ./args 'Hello, world!' 'Second line', and the second was ./args 'Third line'.
Another important option to xargs is -r, which tells it not to run the command at all when there are no additional arguments:
true | xargs -r ./args
does not output anything, because xargs sees no input, and the -r option tells it not to run our args program if there are no additional arguments.
When manipulating file names or paths, the -0 (dash zero) option tells xargs that the input separator is the nul character, \0, which in C terminates strings. If we use that as the separator in the input to xargs, even strings with newlines and such will be correctly split into arguments. For example:
printf 'One thing\non two lines\0Second thing' | xargs -0 ./args
will output
argc = 3
argv[0] = "./args"
argv[1] = "One thing
on two lines"
argv[2] = "Second thing"
which is exactly what one would want, if processing file names or paths in a robust manner.
Is there a way to do this without needing xargs and can be done in the c files themselves?
Of course: just read standard input. xargs is almost certainly written in C itself on all Unixy systems.
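As a rough illustration of "just read standard input", here is a minimal sketch of a whitespace tokenizer. The name read_tokens() is made up for this example, and it is not how xargs itself is implemented: it only splits on whitespace and gives quotes and backslashes no special treatment.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of any library): read whitespace-separated
 * tokens from `in`. Returns a NULL-terminated, malloc()ed array of
 * malloc()ed strings and stores the number of tokens in *count.
 * Returns NULL if memory runs out. */
char **read_tokens(FILE *in, size_t *count)
{
    char  **list = NULL;
    size_t  used = 0, size = 0;   /* tokens stored, slots allocated */
    char   *tok = NULL;
    size_t  len = 0, cap = 0;     /* current token length and capacity */
    size_t  i;
    int     c;

    do {
        c = fgetc(in);
        if (c != EOF && !isspace(c)) {
            /* Grow the token buffer, keeping room for the final '\0'. */
            if (len + 2 > cap) {
                char *tmp;
                cap = cap ? 2 * cap : 16;
                tmp = realloc(tok, cap);
                if (!tmp)
                    goto fail;
                tok = tmp;
            }
            tok[len++] = (char)c;
        } else if (len > 0) {
            /* Whitespace or end of input finishes the current token. */
            tok[len] = '\0';
            if (used + 2 > size) {
                char **tmp;
                size = size ? 2 * size : 8;
                tmp = realloc(list, size * sizeof *list);
                if (!tmp)
                    goto fail;
                list = tmp;
            }
            list[used++] = tok;
            tok = NULL;
            len = 0;
            cap = 0;
        }
    } while (c != EOF);

    /* Empty input still yields a valid, empty array. */
    if (!list) {
        list = malloc(sizeof *list);
        if (!list)
            goto fail;
    }
    list[used] = NULL;
    *count = used;
    return list;

fail:
    free(tok);
    for (i = 0; i < used; i++)
        free(list[i]);
    free(list);
    return NULL;
}
```

Calling read_tokens(stdin, &n) at the start of main() gives the program an argv-like array built from piped input, so ./a.out | ./test could work without xargs in between.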
How does [xargs] separate out a string in that format and decide what to put in the array?
The short answer is that it depends on the options used, because xargs is a pretty powerful little tool.
The full answer is, look at the sources. The source for the GNU xargs (part of findutils) is here, and the source for FreeBSD version is here.
The code answer depends on whether you can use POSIX.1 or not, specifically getline() or getdelim(). If you have a single-character separator (be it any single-byte character at all, even nul), you can use getdelim() to read each "parameter" from the input as a separate string. This is what I'd do, but note that it relies on POSIX.1-2008 rather than on standard C alone. (Nowadays, if you have a maintained Unixy computer, it is almost certain to have POSIX.1 support built into its C library.)
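A minimal sketch of the getdelim() approach could look like the following. The function name read_records() is an assumption made for this example; only getdelim() itself comes from POSIX.1-2008. Unlike the xargs default, this keeps spaces inside each record and also keeps empty records.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: read records separated by the single character
 * `delim` (which may be '\n' or even '\0') from `in`. Returns a
 * NULL-terminated, malloc()ed array of malloc()ed strings and stores
 * the record count in *count. Returns NULL if memory runs out. */
char **read_records(FILE *in, int delim, size_t *count)
{
    char  **list = NULL;
    size_t  used = 0, size = 0;   /* records stored, slots allocated */
    char   *line = NULL;
    size_t  cap = 0;
    size_t  i;
    ssize_t len;

    while ((len = getdelim(&line, &cap, delim, in)) > 0) {
        /* Strip the trailing delimiter, if the record has one. */
        if (line[len - 1] == (char)delim)
            line[--len] = '\0';
        if (used + 2 > size) {
            char **tmp;
            size = size ? 2 * size : 8;
            tmp = realloc(list, size * sizeof *list);
            if (!tmp)
                goto fail;
            list = tmp;
        }
        /* Keep this buffer; getdelim() allocates a new one next time
         * because we reset line to NULL and cap to 0. */
        list[used++] = line;
        line = NULL;
        cap = 0;
    }
    free(line);
    line = NULL;

    /* Empty input still yields a valid, empty array. */
    if (!list) {
        list = malloc(sizeof *list);
        if (!list)
            goto fail;
    }
    list[used] = NULL;
    *count = used;
    return list;

fail:
    free(line);
    for (i = 0; i < used; i++)
        free(list[i]);
    free(list);
    return NULL;
}
```

With delim set to '\n' this splits the input into lines, roughly what xargs -d '\n' hands to the command; with '\0' it matches the input format that xargs -0 expects.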
Why is argc == 9?
If we duplicate your input using printf 'o---o\n|   |\n|   |\n|   |\no---o\n' and pipe it to xargs ./args, the output is as expected,
argc = 9
argv[0] = "./args"
argv[1] = "o---o"
argv[2] = "|"
argv[3] = "|"
argv[4] = "|"
argv[5] = "|"
argv[6] = "|"
argv[7] = "|"
argv[8] = "o---o"
i.e. each part of your ASCII art, separated at whitespace, is supplied as a separate command-line parameter. If we pipe it to xargs -d '\n' ./args, the output is
argc = 6
argv[0] = "./args"
argv[1] = "o---o"
argv[2] = "|   |"
argv[3] = "|   |"
argv[4] = "|   |"
argv[5] = "o---o"
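The two counts can be checked with a few lines of C. This is only a sketch: count_pieces() is a made-up helper, and it ignores the quote and backslash handling that real xargs does by default. Since argv[0] is always the program name, argc is one more than the number of pieces.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the non-empty pieces of `input` that are
 * separated by runs of any characters listed in `seps`. */
size_t count_pieces(const char *input, const char *seps)
{
    size_t n = 0;

    input += strspn(input, seps);       /* skip leading separators */
    while (*input) {
        n++;
        input += strcspn(input, seps);  /* skip over the piece itself */
        input += strspn(input, seps);   /* skip the separators after it */
    }
    return n;
}
```

For the 5 by 5 shape, splitting on " \t\n" gives two "o---o" pieces plus two "|" pieces from each of the three middle lines, so 2 + 6 = 8 pieces and argc == 9. Splitting on "\n" alone gives 5 pieces and argc == 6.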
If you had written that initial args.c program for yourself, you probably could have found the answer to your questions yourself via exploration. That is what makes programming so powerful: you can write tools that help you understand the problems you wish to solve. Applying the Unix philosophy and the KISS principle means those tools are often quite simple to write as well. Just write them well in the first place, so you can trust their results, and don't need to rewrite them too often.