Tags: python, list, file, directory, pathlib

Generate string of only file names from directory/subdirectory in Python, no directory address


NOTE: SEE EDITED VERSION OF QUESTION AT THE BOTTOM RE USING pathlib

I would like to iterate through a directory and its subdirectories (on a Mac) and list all filenames as a string. I can do this, but the string includes the directory info, e.g. /Users/TK/Downloads/Temp/a_c/imgs_a/a1.tif

I just want the "a1.tif".

Here's my code


'''
    For the given path, get the List of all files in the directory tree
'''

import os
def getListOfFiles(dirName):
    # create a list of file and sub directories
    # names in the given directory
    listOfFile = os.listdir(dirName)
    allFiles = list()
    # Iterate over all the entries
    for entry in listOfFile:
        # Create full path
        fullPath = os.path.join(dirName, entry)
        # If entry is a directory then get the list of files in this directory
        if os.path.isdir(fullPath):
            allFiles = allFiles + getListOfFiles(fullPath)
        else:
            allFiles.append(fullPath)

    return allFiles

dirName = "/Users/TK/Downloads/Temp_Folder/a_c"
# Get the list of all files in directory tree at given path
listOfFiles = getListOfFiles(dirName)
file_string = str(sorted(listOfFiles))
print(file_string) 

How do I get rid of the directory info and list just the file name (without the extension would be even better)?

--CHANGE OF CODE AS PER BELOW SUGGESTIONS-- --IT WORKS WITH A FEW SMALL ISSUES--

import os
from pathlib import Path

path = os.chdir("/Users/TK/Downloads/Temp_Folder/a_c")

path = Path.cwd()

files = []
for file in path.rglob('*'):  # loop recursively over all subdirectories
    files.append(file.name)

files = [file.stem for file in path.rglob('*')]

fileList = str(sorted(files))
print(fileList)

The result is ['.DS_Store', '.DS_Store', '.tif', 'a1', 'a2', 'a3', 'a4', 'a5', 'a6', 'b1', 'b2', 'b3', 'b4', 'b5', 'b6', 'c1', 'c2', 'c3', 'c4', 'c5', 'c6', 'imgs_a', 'imgs_b', 'imgs_c']

Almost perfect. Can I get rid of everything except 'a1', 'a2' ... 'c6'?

Also, I couldn't work out how to pass the directory to path = Path.cwd(), which is why I used path = os.chdir("/Users/TK/Downloads/Temp_Folder/a_c")

-------------Edited Question------------

I like the idea of using pathlib as suggested below. From what I've researched online, it seems to be the simplest way to get the job done, and it should work. But somehow it's not giving me what I'm after.

pathlib code I've tried

from pathlib import Path
path = Path('/Users/TK/Downloads/Temp_Folder/a_c')
files = [file.stem for file in path.rglob('*')]
print(files)

from pathlib import Path
print(Path('/Users/TalaKaplinovsky/Downloads/Patrick_Strips_Temp_Folder/a_c').stem)

Both versions give me the same output: '/Users/TK/Downloads/Temp_Folder/a_c/.DS_Store', '/Users/TK/Downloads/Temp_Folder/a_c/imgs_b/b2.tif', '/Users/TK/Downloads/Temp_Folder/a_c/imgs_b/b3.tif', '/Users/TK/Downloads/Temp_Folder/a_c/imgs_b/b1.tif', '/Users/TK/Downloads/Temp_Folder/a_c/imgs_a/.tif', '/Users/TK/Downloads/Temp_Folder/a_c/imgs_a/a4.tif',

I just want 'b2', 'b3', 'b1', 'a4', sorted in order (a4, b1, etc.)


Solution

  • You can do this fairly simply using pathlib (part of the Python standard library):

    from pathlib import Path
    
    path = Path.cwd()  # insert your path 
    
    files = []
    for file in path.rglob('*'):  # loop recursively over all subdirectories
        files.append(file.name)
    

    or, even simpler:

    files = [file.name for file in path.rglob('*')]
    

    To remove the extension, you can use Path.stem:

    files = [file.stem for file in path.rglob('*')]
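
Note that rglob('*') also yields directories (imgs_a, imgs_b, ...) and hidden files (.DS_Store, and the stray file named ".tif"), which is why they show up in the result posted in the question. A minimal sketch of one way to filter them out is below. The helper name list_file_stems and its suffix parameter are made up for illustration, and it assumes only .tif files are wanted:

```python
from pathlib import Path


def list_file_stems(root, suffix=".tif"):
    """Return the sorted stems of non-hidden files under root ending in suffix.

    Hypothetical helper: the name and the suffix default are assumptions
    made for this example, not part of pathlib.
    """
    stems = []
    for entry in Path(root).rglob("*"):  # walk all subdirectories
        # Skip directories such as imgs_a, imgs_b, imgs_c.
        if not entry.is_file():
            continue
        # Skip hidden files such as .DS_Store or a file named ".tif".
        if entry.name.startswith("."):
            continue
        # Keep only the wanted extension, ignoring case.
        if entry.suffix.lower() != suffix.lower():
            continue
        stems.append(entry.stem)  # file name without directory or extension
    return sorted(stems)
```

Calling list_file_stems('/Users/TK/Downloads/Temp_Folder/a_c') passes the directory straight to Path, so os.chdir is not needed, and ", ".join(...) on the result gives a single string. The sort is plain string order, so a name like a10 would sort before a2.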