I want to loop over only the files whose extension is CR2, CR3, cr2, or cr3 (that is, extensions containing "cr"). I am currently using os.walk(), but people recommend pathlib, where I can do something like path.glob('*.jpg'). Even so, I cannot express the desired condition. Is there a better way to do this?
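import os
from pathlib import Path

for root, dirs, files in os.walk(cfg.RAWIMG_DIR):
    dirs.sort()
    for file in files:
        # `dir` is Python's built-in here; presumably the current directory
        # name (e.g. Path(root).name) is what was meant
        if dir > '10':
            path = Path(root) / file
            if 'cr' in path.suffix.lower():
                ...  # process the file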
Additionally, since there are too many files, I would like to process them in portions (say, 10 files at a time) during the loop. This is why I need the list of directory names as well.
Using pathlib, you could do this:
from pathlib import Path
path = Path(cfg.RAWIMG_DIR)
for file in path.glob("**/*.[Cc][Rr][23]"):
    print(file)
The ** part of the glob makes the match recursive, so it covers path itself and all of its subdirectories.
Note that this glob also matches *.cR3, *.Cr2, etc. (mixed case). I am not sure whether you want that.
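To see which names the character-class pattern accepts without touching the file system, here is a small sketch using fnmatch.fnmatchcase as a stand-in for the glob matching. The file names are made up for illustration, and the assumption is that pathlib's glob treats this pattern the same way on a case-sensitive file system.

```python
from fnmatch import fnmatchcase

PATTERN = "*.[Cc][Rr][23]"

# Hypothetical file names, for illustration only.
names = [
    "IMG_0001.CR2",
    "IMG_0002.cr3",
    "IMG_0003.cR3",  # mixed case, still accepted by the pattern
    "IMG_0004.Cr2",  # mixed case, still accepted by the pattern
    "IMG_0005.jpg",
    "IMG_0006.crw",  # rejected: the last character must be 2 or 3
]

# fnmatchcase compares case-sensitively on every platform.
matched = [name for name in names if fnmatchcase(name, PATTERN)]
print(matched)
```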
If you want to match the suffixes exactly, you could do this:
from pathlib import Path
path = Path(cfg.RAWIMG_DIR)
suffixes = {".cr2", ".cr3", ".CR2", ".CR3"}
for file in path.glob("**/*.*"):
    if file.suffix in suffixes:
        print(file)
Note that both versions print the whole path of each file (directory included), not just the file name, and both versions are lazy. You could turn either one into a generator and use batched (from the Itertools Recipes) to get batches of files to work on.
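A minimal sketch of that idea, assuming Python 3.8+ and a batch size of 10. The names iter_raw_files and process_in_batches are hypothetical, and batched is a hand-written equivalent of the recipe built on islice (Python 3.12+ ships itertools.batched, which could be used instead).

```python
from itertools import islice
from pathlib import Path

SUFFIXES = {".cr2", ".cr3", ".CR2", ".CR3"}


def iter_raw_files(root):
    """Lazily yield files under root whose suffix is exactly one of SUFFIXES."""
    for file in Path(root).glob("**/*"):
        if file.is_file() and file.suffix in SUFFIXES:
            yield file


def batched(iterable, n):
    """Yield tuples of up to n items; the last tuple may be shorter."""
    iterator = iter(iterable)
    while batch := tuple(islice(iterator, n)):
        yield batch


def process_in_batches(root, batch_size=10):
    """Walk the matching files batch by batch (hypothetical processing step)."""
    for batch in batched(iter_raw_files(root), batch_size):
        for file in batch:
            # file.parent is the containing directory, if its name is needed.
            print(file.parent.name, file.name)
```

Calling process_in_batches(cfg.RAWIMG_DIR) would then handle the files 10 at a time. glob does not promise any particular order, so if the sorted directory order from the os.walk() version matters, wrap the generator in sorted(), at the cost of losing laziness.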