Using find . -print0 seems to be the only safe way of obtaining a list of files in bash, because filenames can contain spaces, newlines, quotation marks, etc.
However, I'm having a hard time actually making find's output useful within bash or with other command line utilities. The only way I have managed to make use of the output is by piping it to perl, and setting perl's input record separator ($/) to the null character:
find . -print0 | perl -e '$/="\0"; @files=<>; print scalar(@files);'
This example prints the number of files found, avoiding the danger of newlines in filenames corrupting the count, as would occur with:
find . | wc -l
As most command line programs do not support null-delimited input, I figure the best thing would be to capture the output of find . -print0 in a bash array, like I have done in the perl snippet above, and then continue with the task, whatever it may be.
How can I do this?
This doesn't work:
find . -print0 | ( IFS=$'\0' ; array=( $( cat ) ) ; echo ${#array[@]} )
A much more general question might be: How can I do useful things with lists of files in bash?
Shamelessly stolen (with some changes) from Greg's BashFAQ:
a=()
while IFS= read -r -d '' file; do
a+=("$file") # or however you want to process each file
done < <(find /tmp -type f -print0)
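As a possible shortcut, newer versions of bash can fill the array in one step with the mapfile builtin. This is a minimal sketch, assuming bash 4.4 or later (where mapfile accepts -d); the scratch directory and the file names in it are made up purely for illustration:

```shell
#!/bin/bash
# Assumption: bash 4.4+, where mapfile supports the -d option.
# Hypothetical scratch directory with awkward file names.
dir=$(mktemp -d)
touch "$dir/plain" "$dir/with space" "$dir/with"$'\n'"newline"

# -d '' makes NUL the delimiter; -t strips the delimiter from each element.
mapfile -d '' -t files < <(find "$dir" -type f -print0)

echo "found ${#files[@]} files"
for f in "${files[@]}"; do
    printf '%q\n' "$f"   # %q prints each name in an unambiguous, quoted form
done

rm -rf "$dir"
```

On older bash versions the read loop above remains the way to go.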
Note that the redirection construct used here (cmd1 < <(cmd2)) is similar to, but not quite the same as, the more usual pipeline (cmd2 | cmd1). In the pipeline version, every command runs in a subshell, including shell builtins and compound commands such as while, so any variables they set (e.g. the array a) are lost when the pipeline finishes. cmd1 < <(cmd2) only runs cmd2 in a subshell, so the array lives past its construction. Warning: this form of redirection (process substitution) is a bash feature, not part of POSIX sh, and it may be unavailable when bash runs in sh-emulation mode; start your script with #!/bin/bash.
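The difference is easy to see by building the same array both ways. This is a small sketch, assuming bash with default options (in particular, the lastpipe option is not enabled); the directory and the names piped and redirected are invented for the demonstration:

```shell
#!/bin/bash
# Hypothetical scratch directory containing two files.
dir=$(mktemp -d)
touch "$dir/one" "$dir/two"

# Pipeline: the while loop runs in a subshell, so its changes are discarded.
piped=()
find "$dir" -type f -print0 | while IFS= read -r -d '' file; do
    piped+=("$file")
done

# Process substitution: the while loop runs in the current shell.
redirected=()
while IFS= read -r -d '' file; do
    redirected+=("$file")
done < <(find "$dir" -type f -print0)

echo "pipeline: ${#piped[@]}"                    # prints "pipeline: 0"
echo "process substitution: ${#redirected[@]}"   # prints "process substitution: 2"

rm -rf "$dir"
```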
Also, because the file processing step (in this case, just a+=("$file"), but you might want to do something fancier directly in the loop) has its input redirected, it cannot safely use any command that reads from stdin: such a command would consume part of the file list instead of the input you intended. To avoid this limitation, I tend to use:
a=()
while IFS= read -r -d '' file <&3; do
a+=("$file") # or however you want to process each file
done 3< <(find /tmp -type f -print0)
...which passes the file list via file descriptor 3, rather than stdin.
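To illustrate why this matters, here is a sketch in which the loop body itself reads from stdin. The here-document stands in for whatever would normally be on stdin (a user at a terminal, say), and the directory, count, and replies are hypothetical names used only for this example:

```shell
#!/bin/bash
# Hypothetical scratch directory containing two files.
dir=$(mktemp -d)
touch "$dir/a" "$dir/b"

count=0
replies=()
{
    # The file list arrives on descriptor 3; stdin stays free for the body.
    while IFS= read -r -d '' file <&3; do
        IFS= read -r reply          # reads one line from stdin, not from find
        replies+=("$reply")
        count=$((count + 1))
    done 3< <(find "$dir" -type f -print0)
} <<'EOF'
yes
no
EOF

echo "processed $count files, replies: ${replies[*]}"
rm -rf "$dir"
```

With the plain stdin version of the loop, the inner read would instead swallow data from the file list.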