
Using glob to recursively get terminal subdirectories


I have a series of subdirectories with files in them:

/cars/ford/escape/sedan/
/cars/ford/escape/coupe/
/cars/ford/edge/sedan/
/cars/ferrari/testarossa/
/cars/kia/soul/coupe/

etcetera.

I'd like to get all of these terminal subdirectory paths from the root, /cars/, using glob (within Python), but not include any of the files within them, nor any parents of a subdirectory. Each one contains only files, no further subdirectories.

I tried using glob("**/"), but that also returns /cars/ford/, /cars/ford/escape/, /cars/ford/edge/, /cars/ferrari/, etc., which I do not want.

I also tried using rglob("*/") but that also returns all files inside the terminal subdirectories.

I can get what I need by just globbing the files and making a set out of their parents, but I feel like there must be an elegant solution to this from the glob side of things. Unfortunately I can't seem to find the proper search terms to discover it. Thanks!
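For reference, the workaround described above could be sketched roughly like this. The helper name terminal_dirs_from_files is hypothetical, and the sketch assumes every terminal directory holds at least one file and that no files sit directly in an intermediate directory:

```python
from pathlib import Path


def terminal_dirs_from_files(root):
    """Return the set of parent directories of all regular files under root.

    Assumption: files live only in terminal directories, so each parent
    collected here is a terminal directory. Empty directories are missed.
    """
    return {p.parent for p in Path(root).rglob("*") if p.is_file()}
```

This works for the layout in the question, but it visits every file just to discard it.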


Solution

  • glob is the wrong tool for this job. Traditional POSIX-style glob expressions don't support any kind of negative assertion. Extglobs do, but only in a restricted form: they make assertions about an individual name, not about what does or doesn't exist alongside it on the filesystem, so they don't fit your use case, and Python doesn't support them anyhow. os.walk() and its newer relatives are better suited.

    Assuming you're on Python 3.12 or newer, which added pathlib.Path.walk():

    import pathlib
    
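    def terminal_dirs(parent):
        # walk() yields (directory, subdirectory names, file names) per directory
        for root, dirs, files in pathlib.Path(parent).walk():
            # No subdirectories means this is a terminal directory
            if not dirs:
                yield root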
    

    For older versions of Python, os.walk() can be used similarly:

    import os
    
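    def terminal_dirs(parent):
        # os.walk() yields (dirpath, dirnames, filenames) per directory
        for dirpath, dirnames, filenames in os.walk(parent):
            # An empty dirnames list means dirpath has no subdirectories
            if not dirnames:
                yield dirpath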
    

    Either can be collapsed to a one-liner if you're in a rush:

    result = [r for r, d, f in os.walk('/cars') if not d]
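To see the os.walk() approach in action, here is a minimal, self-contained sketch. It rebuilds the layout from the question in a scratch directory; the LEAVES list and the specs.txt file name are made up for illustration:

```python
import os
import tempfile


def terminal_dirs(parent):
    """Yield every directory under parent that has no subdirectories."""
    for dirpath, dirnames, _filenames in os.walk(parent):
        if not dirnames:
            yield dirpath


# Hypothetical layout mirroring the question.
LEAVES = [
    "ford/escape/sedan",
    "ford/escape/coupe",
    "ford/edge/sedan",
    "ferrari/testarossa",
    "kia/soul/coupe",
]

with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "cars")
    for leaf in LEAVES:
        leaf_dir = os.path.join(root, *leaf.split("/"))
        os.makedirs(leaf_dir)
        # Each terminal directory holds only files.
        with open(os.path.join(leaf_dir, "specs.txt"), "w") as fh:
            fh.write("placeholder\n")

    # Report paths relative to the root, with "/" separators for portability.
    found = sorted(
        os.path.relpath(d, root).replace(os.sep, "/") for d in terminal_dirs(root)
    )

print(found)
```

Only the five leaf directories are reported; intermediate ones such as ford/ and ford/escape/ are skipped. Note that if parent itself has no subdirectories, it is yielded as well.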