I want to use pathlib.glob() to find directories with a specific name pattern (*data) in the current working dir. I don't want to explicitly check via .isdir() or something else.
This is the relevant listing: three folders, which are the expected result, and one file that matches the same pattern but should not be part of the result.
ls -ld *data
drwxr-xr-x 2 user user 4,0K 9. Sep 10:22 2021-02-11_68923_data/
drwxr-xr-x 2 user user 4,0K 9. Sep 10:22 2021-04-03_38923_data/
drwxr-xr-x 2 user user 4,0K 9. Sep 10:22 2022-01-03_38923_data/
-rw-r--r-- 1 user user 0 9. Sep 10:24 2011-12-43_3423_data
[
'2021-02-11_68923_data/',
'2021-04-03_38923_data/',
'2022-01-03_38923_data/'
]
from pathlib import Path
cwd = Path.cwd()
result = cwd.glob('*_data/')
result = list(result)
That gives me the 3 folders but also the file.
I also tried the variant cwd.glob('**/*_data/').
The trailing path separator certainly should be respected in pathlib.glob patterns. This is the expected behaviour in shells on all platforms, and is also how the glob module works:
If the pattern is followed by an os.sep or os.altsep then files will not match.
However, a bug in pathlib caused the trailing separator to be ignored. It was fixed in bpo-22276, and the fix was merged in Python-3.11.0rc1 (see what's new: pathlib).
In the meantime, as a work-around you can use the glob module to get the behaviour you want:
$ ls -ld *data
drwxr-xr-x 2 user user 4096 Sep 9 22:45 2022-01-03_38923_data
drwxr-xr-x 2 user user 4096 Sep 9 22:44 2021-04-03_38923_data
drwxr-xr-x 2 user user 4096 Sep 9 22:44 2021-02-11_68923_data
-rw-r--r-- 1 user user 0 Sep 9 22:45 2011-12-43_3423_data
>>> import glob
>>> res = glob.glob('*_data')
>>> print('\n'.join(res))
2022-01-03_38923_data
2011-12-43_3423_data
2021-02-11_68923_data
2021-04-03_38923_data
>>> res = glob.glob('*_data/')
>>> print('\n'.join(res))
2022-01-03_38923_data/
2021-02-11_68923_data/
2021-04-03_38923_data/
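
If you would still like to end up with Path objects, the same work-around can be wrapped in a small helper. This is only a sketch: the function name find_dirs and its parameters are made up for illustration and are not part of pathlib or glob. It relies on the glob module rule quoted above (a pattern ending in a separator does not match files) and converts the matches back to Path afterwards.

```python
import glob
import os
from pathlib import Path


def find_dirs(base, pattern):
    """Return directories directly under `base` that match `pattern`.

    Hypothetical helper: it delegates to glob.glob() so that the
    trailing separator is honoured on Python versions older than 3.11.
    """
    # Escape the base path so glob characters in it are taken literally,
    # then append os.sep so that regular files are excluded.
    full_pattern = os.path.join(glob.escape(str(base)), pattern) + os.sep
    # Path() drops the trailing separator from each match.
    return sorted(Path(match) for match in glob.glob(full_pattern))
```

With the listing from the question, find_dirs(Path.cwd(), '*_data') should return the three folders and skip the file 2011-12-43_3423_data.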