python file

How to determine if a path is a subdirectory of another?


I am given a list of paths whose files I need to check. If the list contains both a root directory and one of its subdirectories, there is no need to process the subdirectory separately. For example:

c:\test  // process this
c:\test\pics // do not process this
c:\test2 // process this

How can I tell, in a cross-platform way, that a path is not a subdirectory of another? I am not worried about symlinks as long as they are not cyclical (the worst case is that I end up processing the data twice).


Solution

  • I would maintain a set of directories you have already processed, and then, before processing each new path, check whether any of its parent directories is already in that set:

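    import os.path

    visited = set()
    for path in path_list:
        # Drop trailing separators so split() yields a real parent.
        path = os.path.normpath(path)
        # Walk up through each ancestor directory of path.
        head, tail = os.path.split(path)
        while head and tail:
            if head in visited:
                break  # an ancestor was already processed
            head, tail = os.path.split(head)
        else:
            # No ancestor is in visited, so process this path.
            process(path)
            visited.add(path)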
    

    Note that path_list should be sorted so that parent directories always come before their subdirectories. Otherwise a subdirectory that appears first is processed before its parent has been added to visited.