unix filesystems ls

Fast way to get number of files matching a pattern


I have a directory that currently holds about 8K files, and over the next month or so that number will grow to more than 100K. I need to count the files in the directory that match a particular pattern, but the pattern match adds an enormous amount of time:

[XXXXXX@login-0-0 scripts]$ time ls | grep . -c
8373

real    0m0.115s
user    0m0.109s
sys     0m0.009s
[XXXXXX@login-0-0 scripts]$ time ls *.o* | grep . -c
6262

real    0m1.997s
user    0m0.121s
sys     0m0.270s

As the number of files keeps growing, the time to count by pattern will become too great. Is there a way around this?

As a side note, the filesystem is Lustre, and I can deal with a non-portable solution.


Solution

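  • Perhaps what's slowing you down is that the glob is expanded by bash, which then passes every matching name to ls as a separate argument. ls has to look up each of those names individually instead of just reading the directory once, so it's a bit of extra work. Letting find do the matching avoids that. This works for me: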

    user@host:~/junk$ time find . | wc -l
    188318
    
    real    0m0.202s
    user    0m0.076s
    sys     0m0.136s
    
    user@host:~/junk$ time find . -name '*.o' | wc -l
    374
    
    real    0m0.243s
    user    0m0.160s
    sys     0m0.080s
    

    Not much difference in speed. Note that find is recursive; add -maxdepth 1 (before -name) if you only want the files directly in the directory. For the pattern in the question, the test would be -name '*.o*'.

    Maybe it's time for me to clean up some junk...
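
The non-recursive variant can be wrapped in a small helper. This is only a sketch, assuming GNU find (for -mindepth, -maxdepth and -printf); the function name count_matching is made up for illustration and is not part of any tool. It has not been timed on Lustre, so treat any speedup as something to measure rather than a given.

```shell
# count_matching DIR PATTERN (hypothetical helper name)
# Counts the entries directly inside DIR whose names match PATTERN.
# The pattern is matched by find, so the shell never expands the glob.
count_matching() {
    # -mindepth 1 skips DIR itself; -maxdepth 1 prevents recursion.
    # One dot is printed per match and the dots are counted, so a file
    # name containing a newline cannot inflate the result.
    find "$1" -mindepth 1 -maxdepth 1 -name "$2" -printf '.' | wc -c
}
```

Call it as count_matching . '*.o*', keeping the pattern in single quotes so that bash passes it to find unexpanded.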