I have a directory with ~8000 files of the form:
output/Manuscript_00750_AnimalGiants-compact.json
output/Manuscript_00750_AnimalGiants-expanded.json
output/Manuscript_00750_AnimalGiants.json
output/Manuscript_00752_AnimalGiants-compact.json
output/Manuscript_00752_AnimalGiants-expanded.json
output/Manuscript_00752_AnimalGiants.json
output/Unit_TZH_12345_Foo-compact.json
output/Unit_TZH_12345_Foo-expanded.json
output/Unit_TZH_12345_Foo.json
I need to come up with a regex to work with the find tool to select just the Manuscript-compact ones:
output/Manuscript_00750_AnimalGiants-compact.json
output/Manuscript_00752_AnimalGiants-compact.json
Coming up with the regex is the easy part; getting find to cooperate is the hard part.
Here's my regex:
/Manuscript[0-9_a-zA-Z]+-compact\.json/
Here are some of the commands I've tried; all produce zero results. The cwd is the directory above output/:
find output -regex "Manuscript[0-9_a-zA-Z]+-compact\.json"
find output -regex "\./output/Manuscript[0-9_a-zA-Z]+-compact\.json/"
find output -regex ".*\Manuscript[0-9_a-zA-Z]+-compact.*\json"
But this command does produce results - it selects all the files that start with "Manuscript", which is obviously too broad:
find output -regex ".*\Manuscript.*\json"
What's the correct regex format for find here?
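find's -regex has to match the whole path (for example output/Manuscript_00750_AnimalGiants-compact.json), not just the file name, so the pattern needs a leading .*/ to cover the directory part.
On OSX you can use find with extended regex: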
find -E output -regex '.*/Manuscript[0-9_a-zA-Z]+-compact\.json$'
With GNU find, use:
find output -regextype posix-extended -regex '.*/Manuscript[0-9_a-zA-Z]+-compact\.json$'
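To check the behaviour without touching the real directory, here is a small sketch that rebuilds a few of the file names from the question in a temporary directory and runs the pattern with and without the leading .*/. The find_ere helper and the variable names are made up for this illustration, and the --version probe assumes the installed find is either GNU or BSD/macOS.

```shell
#!/bin/sh
# Build a throwaway tree that mirrors the layout from the question.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir "$tmp/output"
for base in Manuscript_00750_AnimalGiants Manuscript_00752_AnimalGiants Unit_TZH_12345_Foo; do
    touch "$tmp/output/$base.json" \
          "$tmp/output/$base-compact.json" \
          "$tmp/output/$base-expanded.json"
done
cd "$tmp" || exit 1

# Hypothetical helper: pick the extended-regex spelling for this find.
# GNU find accepts --version; BSD/macOS find rejects it.
if find --version >/dev/null 2>&1; then
    find_ere() { find "$1" -regextype posix-extended -regex "$2"; }
else
    find_ere() { find -E "$1" -regex "$2"; }
fi

# Without a leading .*/ the pattern cannot cover "output/", so nothing matches.
unanchored=$(find_ere output 'Manuscript[0-9_a-zA-Z]+-compact\.json' | sort)

# With .*/ the pattern spans the whole path and selects the two compact files.
matches=$(find_ere output '.*/Manuscript[0-9_a-zA-Z]+-compact\.json' | sort)

printf 'unanchored: [%s]\n' "$unanchored"
printf '%s\n' "$matches"
```

The first result should come back empty and the second should list only the two Manuscript-compact files, which matches what the attempts in the question showed.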