I have a shell script:
find . -name "*.java" -print0 | xargs -0 grep -Lz 'regular_expression'
which outputs the names of the files that do not match the regexp, like this:
file1.java
file2.java
...
As I understand it, it works as follows: find finds the needed files and joins their names with \0. Then xargs splits the output of find on \0 and feeds the names to grep one by one.
Then I wanted to add one more stage and get only the basename of each file. I modified the script:
find . -name "*.java" -print0 | xargs -0 grep -LzZ 'regular_expression' | xargs -0 basename
but got an error. I started investigating and added some temporary output:
find . -name "*.java" -print0 | xargs -0 grep -LzZ 'regular_expression' | xargs -0 echo basename
and got this:
basename ./file1.java ./file2.java ./subdir/file1.java ./subdir/file2.java
So, the filenames were not split by \0. I don't understand why they are split when xargs is used with grep and not split when xargs is used with basename.
I found a workaround by using -n1 in the latter xargs. But I still don't understand why I needed it (given that I didn't use it in the xargs with grep) and what this parameter does.
I hope you can explain what -n1 does, and why I needed it in the latter usage but not in the former one with grep.
The filenames were split by \0. Your echo test only looks unsplit because echo prints all of its arguments on one line, separated by spaces. The difference is in the commands you're using. xargs normally takes its standard input, breaks it into a list (here, by splitting on NUL), and then passes that list as extra arguments to your command. So when you do this:
find . -name "*.java" -print0 | xargs -0 grep -Lz 'regular_expression'
what actually runs is this:
grep -Lz 'regular_expression' file1.java file2.java file3.java...
Here, the -z plays no part in how the filenames are passed: it changes how grep splits the data it reads (the contents of the files, or stdin) into lines, not how it interprets its arguments.
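To see that a single xargs -0 invocation hands the whole list to one command, here is a minimal sketch. The make_names function and its three file names are hypothetical stand-ins for the find output, and it assumes an xargs that supports -0 (GNU and BSD both do).

```shell
# Hypothetical stand-in for `find ... -print0`: three NUL-terminated names.
make_names() {
    printf '%s\0' ./file1.java ./file2.java ./subdir/file1.java
}

# xargs appends every name to one command line, so sh -c sees them all
# as positional parameters ($1, $2, $3) in a single run.
report=$(make_names | xargs -0 sh -c 'echo "$# arguments in one run"' sh)

printf '%s\n' "$report"   # prints: 3 arguments in one run
```

The trailing sh fills $0 of the inner shell, so the names start at $1 and $# counts only the names.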
So, when you add another xargs that runs basename, you get this:
basename file1.java file2.java file3.java...
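But while grep will take any number of filename arguments, basename expects a single name (optionally followed by a suffix to strip), so a long list of arguments makes it fail. With GNU basename this shows up as an "extra operand" error.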
That's where -n 1 comes in: it tells xargs to pass at most one argument per invocation, so it breaks its list into chunks of one and runs the command once per chunk. So what runs now is:
basename file1.java
basename file2.java
basename file3.java
...
Each invocation writes to the same stdout, so the outputs appear one after another.
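A small sketch of the -n 1 behaviour, again using hypothetical file names in place of the real grep -LZ output:

```shell
# Two NUL-terminated paths standing in for the output of grep -LZ.
# -n 1 makes xargs run basename once per path.
names=$(printf '%s\0' ./file1.java ./subdir/file2.java | xargs -0 -n 1 basename)

# Each basename run prints one line; together they form the final output.
printf '%s\n' "$names"
```

If GNU coreutils is available, basename -a is an alternative that accepts several names in a single call, which avoids starting one process per file; treat that as an assumption about your system rather than something portable.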