
Why doesn't the find command pass all files as input to -exec?


These are all the files and symbolic links in the directory:

./02.tdf/IP_B186.tdf
./02.tdf/In_B186.tdf
./03.bed/IP_B186_vs_In_B186_peaks.xls.promoter_peak.bed
./03.bed/IP_B186_vs_In_B186_peaks.xls.bed
./01.genome/Harukei3_v1.41.fa
./01.genome/Harukei3_v1.41.gene.bed
./md5.txt

When I run find . -type f -o -type l -exec md5sum {} + | sort > md5.txt, I get:

cat md5.txt
331a1bebdaf8b09fbdb6468b9c53bac3  ./01.genome/Harukei3_v1.41.fa
6354c1dfb5b0aff9620712ec17c271e3  ./01.genome/Harukei3_v1.41.gene.bed

When I run find . -type f -o -type l | while read i; do md5sum $i >> md5.txt; done, I get:

cat md5.txt
d9eb8d2ad4f0c03ca1dc9628e3562f01  ./02.tdf/IP_B186.tdf
c72eda337f2d75301baf125d6d64a5bc  ./02.tdf/In_B186.tdf
3ea65045235ed51efcc88af49d126a60  ./03.bed/IP_B186_vs_In_B186_peaks.xls.promoter_peak.bed
954484488dce3f5f18b7c06f6693f223  ./03.bed/IP_B186_vs_In_B186_peaks.xls.bed
331a1bebdaf8b09fbdb6468b9c53bac3  ./01.genome/Harukei3_v1.41.fa
6354c1dfb5b0aff9620712ec17c271e3  ./01.genome/Harukei3_v1.41.gene.bed

The second result is what I want. Why does the first command give such strange output? Could someone help me? Many thanks.

I also tried find . -type f -o -type l -exec md5sum {} \; | sort > md5.txt, but it gives the same unwanted output.


Solution

  • The standard for find states (formatting mine):

    The primaries can be combined using the following operators (in order of decreasing precedence):

    • ( expression )
      • True if expression is true.
    • ! expression
      • Negation of a primary; the unary NOT operator.
    • expression [ -a ] expression
      • Conjunction of primaries; the AND operator is implied by the juxtaposition of two primaries or made explicit by the optional -a operator. The second expression shall not be evaluated if the first expression is false.
    • expression -o expression
      • Alternation of primaries; the OR operator. The second expression shall not be evaluated if the first expression is true.

    If no expression is present, -print shall be used as the expression. Otherwise, if the given expression does not contain any of the primaries -exec, -ok, or -print, the given expression shall be effectively replaced by:

    ( given_expression ) -print


    find . -type f -o -type l -exec md5sum {} +

    This has the form expression -o expression, with expressions:

    • -type f
    • -type l -exec md5sum {} +
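      • (which itself has the form expression -a expression, with the -a implied)

    Only the second expression contains an action, so only it produces output. For a regular file, -type f is true, so -o short-circuits and the second expression, including -exec, is never evaluated; md5sum therefore runs only for symbolic links. Because the expression contains -exec, no implicit -print is added.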
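
A minimal sketch of that parse, assuming a scratch directory with one regular file and one symbolic link (regular.txt and link.txt are hypothetical names). It uses echo in place of md5sum and spells out the grouping that find applies implicitly:

```shell
# Scratch directory; the file names below are made up for illustration.
tmp=$(mktemp -d) || exit 1
cd "$tmp" || exit 1
echo data > regular.txt
ln -s regular.txt link.txt

# The expression as written in the question (echo stands in for md5sum).
as_written=$(find . -type f -o -type l -exec echo {} \;)

# The same expression with the implied -a and grouping made explicit.
as_parsed=$(find . -type f -o \( -type l -a -exec echo {} \; \))

printf 'as written: %s\n' "$as_written"
printf 'as parsed:  %s\n' "$as_parsed"

# Return to the previous directory and remove the scratch directory.
cd - > /dev/null && rm -rf "$tmp"
```

Both commands should report only the symbolic link, since the regular file satisfies -type f and never reaches -exec.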

    find . -type f -o -type l

    This also has the form expression -o expression, with expressions:

    • -type f
    • -type l

    Neither expression explicitly produces output. However, the final rule above applies, and the implied command is:

    find . \( -type f -o -type l \) -print
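
The implicit -print rule can be checked with a small sketch, again assuming a scratch directory with hypothetical names. It compares the action-free command with its explicit equivalent, and with a variant where -print is written out but not parenthesized:

```shell
# Scratch directory; the file names below are made up for illustration.
tmp=$(mktemp -d) || exit 1
cd "$tmp" || exit 1
echo data > regular.txt
ln -s regular.txt link.txt

# No action given: find wraps the whole expression and appends -print.
implicit=$(find . -type f -o -type l | LC_ALL=C sort)

# The explicit form of the same command.
explicit=$(find . \( -type f -o -type l \) -print | LC_ALL=C sort)

# An explicit -print without parentheses binds only to -type l.
links_only=$(find . -type f -o -type l -print)

printf 'implicit:\n%s\n' "$implicit"
printf 'explicit:\n%s\n' "$explicit"
printf 'unparenthesized -print:\n%s\n' "$links_only"

# Return to the previous directory and remove the scratch directory.
cd - > /dev/null && rm -rf "$tmp"
```

The first two results should match and list both entries, while the third should list only the symbolic link.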
    

    As @jqurious points out in the comments, to get your desired result, you should use parentheses:

    find . \( -type f -o -type l \) -exec md5sum {} +
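
Putting it together with the sort and redirection from the question might look like the sketch below. The directory layout and file names are hypothetical. The ! -name md5.txt test and the sort -k 2 key are additions not present in the original commands: the first assumes you do not want the output file to be hashed (the shell creates it before find starts), and the second sorts by path rather than by hash.

```shell
# Scratch directory; the layout and file names are made up for illustration.
tmp=$(mktemp -d) || exit 1
cd "$tmp" || exit 1
mkdir 01.genome 02.tdf
echo genome > 02.tdf/real.fa                 # regular file
ln -s ../02.tdf/real.fa 01.genome/link.fa    # symbolic link to it

# Hash regular files and symlinks, skip the output file, sort by path.
find . \( -type f -o -type l \) ! -name md5.txt -exec md5sum {} + |
    sort -k 2 > md5.txt

result=$(cat md5.txt)
printf '%s\n' "$result"

# Return to the previous directory and remove the scratch directory.
cd - > /dev/null && rm -rf "$tmp"
```

md5sum follows the symbolic link, so both lines should carry the same checksum here.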