I want to use the glob function to find files located in folders whose names match either of two different patterns.
The solution I found is simply:
import glob
files1 = glob.glob('*type1*/*')
files2 = glob.glob('*type2*/*')
files = files1 + files2
Is there any way of rewriting this using only one glob? If yes, would it be faster?
Something like
files = glob.glob('*[type1, type2]*/*')
glob understands shell-style wildcards, including character classes such as [12], so you can simply do:
files1 = glob.glob('*type[12]*/*')
Or, if you need to cover more numbers, use a range (here 1 through 6):
files1 = glob.glob('*type[1-6]*/*')
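To check that the character-class pattern really finds the same files as the two separate calls, here is a minimal sketch. The folder names (a_type1_x and so on) and the helper names make_tree, two_globs, and one_glob are made up for illustration; it builds a throwaway tree in a temporary directory and compares the two approaches.

```python
import glob
import os
import tempfile


def make_tree(root):
    """Create a small throwaway tree: three folders with one file each.

    The folder names are hypothetical examples, not from the question.
    """
    for folder in ("a_type1_x", "b_type2_y", "c_type3_z"):
        os.makedirs(os.path.join(root, folder))
        with open(os.path.join(root, folder, "data.txt"), "w") as fh:
            fh.write("example\n")


def two_globs(root):
    """The original approach: one glob() call per folder pattern."""
    base = glob.escape(root)  # protect any special characters in the root path
    return (glob.glob(os.path.join(base, "*type1*", "*"))
            + glob.glob(os.path.join(base, "*type2*", "*")))


def one_glob(root):
    """A single call using the [12] character class."""
    base = glob.escape(root)
    return glob.glob(os.path.join(base, "*type[12]*", "*"))


with tempfile.TemporaryDirectory() as root:
    make_tree(root)
    # glob() returns results in arbitrary order, so sort before comparing.
    separate = sorted(two_globs(root))
    combined = sorted(one_glob(root))

print(separate == combined)  # True: both approaches find the same two files
```

Note that a character class matches a single character, so this only works when the folder names differ in one character. Python's glob does not support shell brace expansion such as {type1,type2}, so for names that differ more than that, concatenating the results of several glob() calls remains the straightforward option.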
It will be faster to call glob() only once, because glob() has to read the current directory and each matching subdirectory (on a Unix system, this is done with the readdir() function), and those reads are repeated for each call to glob(). The directory contents might be cached by the OS, so they don't have to be read from disk again, but the call still has to be repeated and glob() has to compare all of the filenames against the glob pattern.
That said, practically speaking, the performance difference isn't likely to be noticeable unless you have thousands of files and subdirectories.
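If you want to measure the difference on your own machine, a rough timing sketch along these lines may help. The tree layout, the folder count, and the helper name build_tree are assumptions chosen for illustration, and the numbers will vary with the filesystem and OS caching.

```python
import glob
import os
import tempfile
import timeit


def build_tree(root, n_folders=50):
    """Create n_folders hypothetical folders named dir_type1_0, dir_type2_1, ..."""
    for i in range(n_folders):
        folder = os.path.join(root, f"dir_type{i % 3 + 1}_{i}")
        os.makedirs(folder)
        with open(os.path.join(folder, "data.txt"), "w") as fh:
            fh.write("example\n")


with tempfile.TemporaryDirectory() as root:
    build_tree(root)
    base = glob.escape(root)  # protect any special characters in the root path
    pat1 = os.path.join(base, "*type1*", "*")
    pat2 = os.path.join(base, "*type2*", "*")
    pat12 = os.path.join(base, "*type[12]*", "*")

    # Time 20 repetitions of each approach.
    t_two = timeit.timeit(lambda: glob.glob(pat1) + glob.glob(pat2), number=20)
    t_one = timeit.timeit(lambda: glob.glob(pat12), number=20)

print(f"two calls: {t_two:.4f}s, one call: {t_one:.4f}s")
```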