Assistance refining search results from if loop?


I am looking for a little assistance to get the desired output from my loop.

I am trying to compile a list of paths to every folder named 'published'. It nearly works, but I'd appreciate it if someone could show me a way to stop the loop from also listing directories that are children of 'published'.

import os

top = r"T:\PROJECTS"  # raw string so the backslash is taken literally

with open('published_directories_list.txt', 'w') as out_file:
    for root, dirs, files in os.walk(top, topdown=False):
        for name in dirs:
            myPath = os.path.join(root, name)

            # Substring test: true if 'published' appears anywhere in the path
            if 'published' in myPath:
                print(myPath)
                out_file.write(myPath + '\n')
            else:
                print(myPath + ' - no published directory!')

print('DONE!')

Solution

  • What's happening is that os.walk iterates over every directory under top. So, if you have a directory structure like:

    top
      |
      - published
      |  |
      |  - something
      |
      - other
    

    at some point in your loop your line:

    myPath = os.path.join(root, name)
    

    will be joining a root of /top/published and a name of something, giving /top/published/something. The check "published" in myPath is a substring test, so it succeeds here: even though you're looking at a sub-directory of published, the name "published" still appears in the path.

    An easy way to fix this would be to check whether myPath ends in "published" (using the endswith string method) instead of checking whether it merely contains it. You could modify your if statement to read:

    if myPath.endswith(os.sep + 'published'):
    

    Note that I included the path separator (os.sep, which is a backslash on Windows) at the start of what we're checking for. This should address DSM's point that we don't want to match "unpublished" as well.
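
If you would also like to avoid walking into the published folders at all, here is a minimal sketch of one way to do it. It is not the only approach: it assumes the default top-down walk (rather than topdown=False as in the question), compares the folder name exactly instead of testing the joined path, and removes the match from dirs so that os.walk never descends into it. The function name find_published and its target parameter are made up for illustration.

```python
import os


def find_published(top, target="published"):
    """Return paths of directories named `target` under `top`.

    Hypothetical helper for illustration: matched directories are pruned
    from the walk, so nothing beneath them is visited or reported.
    """
    matches = []
    # topdown=True (the default) is required for in-place pruning of `dirs`.
    for root, dirs, _files in os.walk(top):
        # List membership is an exact name match, so "unpublished" is ignored.
        if target in dirs:
            matches.append(os.path.join(root, target))
            dirs.remove(target)  # tell os.walk not to descend into it
    return matches
```

You could then write the result out with something like `out_file.write('\n'.join(find_published(top)) + '\n')`. Note that the comparison is case-sensitive, so a folder named "Published" would not be matched by this sketch.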