xargs inconsistent behavior and -n1 parameter


I have a shell script

find . -name "*.java" -print0 | xargs -0 grep -Lz 'regular_expression'

which outputs file names not matching the regexp in this way:

file1.java
file2.java
...

As I understand it, this works as follows: find locates the matching files and joins their names with \0. Then xargs splits find's output on \0 and feeds the names to grep one by one.

Then I wanted to add one more stage to get only the basename of each file, so I modified the script:

find . -name "*.java" -print0 | xargs -0 grep -LzZ 'regular_expression' | xargs -0 basename

but got an error. I started investigating and added some temporary debug output:

find . -name "*.java" -print0 | xargs -0 grep -LzZ 'regular_expression' | xargs -0 echo basename

and got this:

basename ./file1.java ./file2.java ./subdir/file1.java ./subdir/file2.java

So the filenames were apparently not split on \0. I don't understand why they are split when xargs is used with grep but not when xargs is used with basename.

I found a workaround: using -n1 in the latter xargs. But I still don't understand why I needed it (given that I didn't use it in the xargs with grep) or what this parameter does.

Hope you can explain to me what -n1 does and why I needed it in the latter usage and didn't need it in the former with grep.


Solution

  • The filenames were split by \0. The difference is in the commands you're using. xargs normally takes its standard input, breaks it into a list (here, by splitting on NUL), and then passes that list as extra arguments to your command. So when you do this:

    find . -name "*.java" -print0 | xargs -0 grep -Lz 'regular_expression'
    

    What actually runs is this:

    grep -Lz 'regular_expression' file1.java file2.java file3.java...
    

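    Here, the -z has no effect on the filenames: it only changes how grep splits the data it reads into lines (on NUL instead of newline). It is the -Z in your second command that makes grep terminate each filename it prints with NUL, which is what the following xargs -0 relies on.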

    So, when you add another xargs that runs basename, you get this:

    basename file1.java file2.java file3.java...
    

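    But while grep will take any number of filename arguments, basename by default accepts only one name (plus an optional suffix to strip). Given a long list of names, it does not process each one; GNU basename rejects the extra operands with an error, which is the error you saw. (GNU basename also has -a, which makes it accept multiple names.)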

    That's where -n 1 comes in: it tells xargs to break its list of arguments into chunks (of 1), and run the command multiple times. So what runs now is:

    basename file1.java
    basename file2.java
    basename file3.java
    ...
    

    And all the output is concatenated together onto stdout.
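
If you want to see the difference for yourself, here is a minimal sketch. The two paths are made-up samples, the names helper is hypothetical, and printf merely stands in for the grep -LZ stage by emitting NUL-terminated names. It assumes an xargs that supports -0 (GNU and BSD both do).

```shell
#!/bin/sh
# Hypothetical sample paths, NUL-terminated like the output of grep -LZ.
names() {
    printf '%s\0' ./file1.java ./subdir/file2.java
}

# Without -n, xargs appends every name to a single command line,
# so echo shows one "basename" followed by all the names.
batched=$(names | xargs -0 echo basename)

# With -n 1, xargs runs basename once per name.
one_each=$(names | xargs -0 -n 1 basename)

printf 'single invocation:\n%s\n' "$batched"
printf 'one invocation per name:\n%s\n' "$one_each"
```

The first result is a single line listing both paths after basename, which shows that xargs did split on NUL but passed everything to one command. The second result is file1.java and file2.java on separate lines, one per basename run.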