I am using the md5deep utility to compute the hashes for files while recursively digging through a directory structure.
It lets me run a command like this:
md5deep -r -l -j0 app
and gives output like this (a recursive list of MD5 hashes, one per file under the directory, computed from each file's content):
d41d8cd98f00b204e9800998ecf8427e app/tests/cases/controllers/empty
d41d8cd98f00b204e9800998ecf8427e app/tests/cases/models/empty
d41d8cd98f00b204e9800998ecf8427e app/tests/cases/components/empty
d41d8cd98f00b204e9800998ecf8427e app/tests/cases/helpers/empty
d41d8cd98f00b204e9800998ecf8427e app/tests/cases/behaviors/empty
d41d8cd98f00b204e9800998ecf8427e app/tests/groups/empty
d41d8cd98f00b204e9800998ecf8427e app/tests/fixtures/empty
I then pipe the result through md5sum to produce a single hash for the entire codebase:
md5deep -r -l -j0 app | md5sum
Output -
86df91fc29f2891ff0aa7aaa4bd13730 -
Now I am stuck on excluding some paths (files and directories) from the calculation of the final md5sum. E.g. I want to exclude these two paths: app/tests/groups/empty and app/tests/fixtures/empty.
The md5deep documentation describes an option (-f) that takes a file containing a list of file names/directories, but the listed files are the ones that get included. I am looking for the opposite, i.e. to exclude a predefined set of files/directories from the dynamic set (new directories/files could be added in future) inside a given directory.
Solutions using regular expressions or some tool/utility other than md5deep are also welcome, as long as they serve my purpose. I feel a regex solution with grep would be complicated in the absence of lookaheads. E.g. the following regex is needed just to match any string that does not start with ABC:
^([^A]|A([^B]|B([^C]|$)|$)|$).*$
Why not use find together with md5sum?
find app -type f -exec md5sum {} \;
d41d8cd98f00b204e9800998ecf8427e app/tests/groups/empty
d41d8cd98f00b204e9800998ecf8427e app/tests/cases/components/empty
d41d8cd98f00b204e9800998ecf8427e app/tests/cases/behaviors/empty
d41d8cd98f00b204e9800998ecf8427e app/tests/cases/models/empty
d41d8cd98f00b204e9800998ecf8427e app/tests/cases/helpers/empty
d41d8cd98f00b204e9800998ecf8427e app/tests/cases/controllers/empty
d41d8cd98f00b204e9800998ecf8427e app/tests/fixtures/empty
If you need to exclude a directory, negate the -path test (! -path); if you need to exclude a file name, negate -name (! -name).
For example, to exclude every file whose pathname contains models, use the following:
find app -type f ! -path "*models*" -exec md5sum {} \;
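To get back to a single hash for the whole tree with the two paths from the question left out, the same idea can be chained into a pipeline. The following is only a sketch: hash_tree is a hypothetical helper name, the temporary tree is demo data standing in for the real app directory, and the sort step is my own addition to keep the file order stable between runs.

```shell
#!/bin/sh
# Build a throwaway tree that mirrors part of the layout from the question
# (demo data only; run the function in the real project directory instead).
demo=$(mktemp -d)
mkdir -p "$demo/app/tests/groups" "$demo/app/tests/fixtures" "$demo/app/tests/cases/models"
: > "$demo/app/tests/groups/empty"
: > "$demo/app/tests/fixtures/empty"
: > "$demo/app/tests/cases/models/empty"

# hash_tree (hypothetical name): hash every regular file under app,
# skip the two excluded paths, sort by path so the order is stable,
# then hash the resulting list into one value.
# To skip a whole directory, a pattern such as 'app/tests/fixtures/*' would do.
hash_tree() {
    find app -type f \
        ! -path 'app/tests/groups/empty' \
        ! -path 'app/tests/fixtures/empty' \
        -exec md5sum {} + | sort -k 2 | md5sum
}

cd "$demo" || exit 1
before=$(hash_tree)

# Changing an excluded file should leave the combined hash untouched.
echo changed > app/tests/groups/empty
after=$(hash_tree)

echo "before: $before"
echo "after:  $after"
```

Each additional exclusion is one more ! -path (or ! -name) test, so the list can grow without touching the rest of the pipeline.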
BTW, if you're looking for empty files, you can use the -empty test (note that it also matches empty directories):
find app -empty
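If you would rather keep md5deep, another option is to filter its output before the final md5sum, using grep with inverted, fixed-string matching so no lookahead regex is needed. This is a sketch under assumptions: exclude.txt is a hypothetical file holding one path per line, and the hash lines below are sample data in the "hash  path" shape rather than real md5deep output.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# exclude.txt (hypothetical name): one path, or path fragment, per line.
cat > "$workdir/exclude.txt" <<'EOF'
app/tests/groups/empty
app/tests/fixtures/empty
EOF

# Sample hash list standing in for the output of: md5deep -r -l -j0 app
cat > "$workdir/hashes.txt" <<'EOF'
d41d8cd98f00b204e9800998ecf8427e  app/tests/cases/models/empty
d41d8cd98f00b204e9800998ecf8427e  app/tests/groups/empty
d41d8cd98f00b204e9800998ecf8427e  app/tests/fixtures/empty
EOF

# -v inverts the match, -F treats each pattern as a fixed string,
# -f reads the patterns from a file.
filtered=$(grep -v -F -f "$workdir/exclude.txt" "$workdir/hashes.txt")
printf '%s\n' "$filtered"
```

In a real run the filter would sit in the middle of the original pipeline:
md5deep -r -l -j0 app | grep -v -F -f exclude.txt | md5sum
Keep in mind that -F matches substrings, so an entry such as app/tests/groups/empty would also drop app/tests/groups/empty2; a trailing slash (app/tests/fixtures/) is a simple way to exclude a whole directory.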