I have a directory that has 8K files at the moment. Over the next month or so that number will grow to 100K plus. I need to be able to count the files in the directory matching a particular pattern. The pattern match adds an enormous amount of time:
[XXXXXX@login-0-0 scripts]$ time ls | grep . -c
8373
real 0m0.115s
user 0m0.109s
sys 0m0.009s
[XXXXXX@login-0-0 scripts]$ time ls *.o* | grep . -c
6262
real 0m1.997s
user 0m0.121s
sys 0m0.270s
As the number of files continues to increase, counting by pattern will become prohibitively slow. Can I get around this?
As a side note, the filesystem is Lustre, and I can deal with a non-portable solution.
Perhaps what's slowing you down is that your glob is expanded by bash, so ls is handed thousands of filenames as arguments and has to stat each one individually; on a networked filesystem like Lustre, each of those metadata lookups is a round trip to the metadata server. find does the pattern matching itself as it reads the directory, with nothing expanded in the shell. This works for me:
user@host:~/junk$ time find . | wc -l
188318
real 0m0.202s
user 0m0.076s
sys 0m0.136s
user@host:~/junk$ time find . -name '*.o' | wc -l
374
real 0m0.243s
user 0m0.160s
sys 0m0.080s
Not much difference in speed. Note that this is recursive, but you can pass -maxdepth 1 if you only want the top-level directory, as sketched below.
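For something closer to your original ls *.o* case, a minimal sketch (assuming GNU find; the directory and pattern are just illustrative):

# quote the pattern so the shell passes it to find unexpanded
find . -maxdepth 1 -name '*.o*' | wc -l

# GNU find only: -printf x emits one byte per match, so the count
# survives filenames that contain newlines
find . -maxdepth 1 -name '*.o*' -printf x | wc -c

The quoting is the important part: '*.o*' has to reach find intact, otherwise bash expands it and you're back where you started.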
Maybe it's time for me to clean up some junk...