I need to exclude a few directories, or scan only some of them, while using os.walk(). I am trying to get the most recently modified files. I learned how to do this from this post, but it only returns one file. For my project I need a list of 5 or more recent files. This post shows how to scan only a few directories, but I have no idea how to implement that in the first post's answer.
I want to exclude the directory that contains the most recently modified file. If Folder 3 contains the most recently modified file, then the next time I scan, looking for the 2nd, 3rd, or later file, I want to exclude that directory.
Here is my file layout:
MainFile (current one)
|
|-- Projects (the one I am scanning)
    # the following folders all contain images, which were created at the same time as the folder
    |-- Folder 1
    |
    |-- Folder 2
    |
    |-- Folder 3
    |
    |-- etc...
My previous approach was as follows. I can't show the code because I have deleted it, but I can explain it:
First: I got a list of the directories in the folder using os.listdir(Projects).
Second: I checked whether there were more than 5 of them, or 5 or fewer.
Third: I went into each folder (I had put them in a list in the first step) and used stats = os.stat(dirname) to get information about it.
Fourth: I put all of the information in a list using recent.insert(0, stats[8]).
Lastly: I compared all the times and took 5 of them, but they were all incorrect.
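The steps above can be sketched roughly as follows. Since the original code was deleted, this is a reconstruction under assumptions, not the code that was actually run: the function name `recent_dirs` and its parameters are made up for illustration. One possible reason for wrong times is that os.listdir() returns bare names, so os.stat(dirname) resolves them against the current working directory rather than against Projects; os.scandir() avoids that because each entry carries its full path. Note also that stats[8] is st_mtime, and on most filesystems a directory's st_mtime changes when entries are added, removed, or renamed inside it, not when the contents of an existing file change.

```python
import os

def recent_dirs(parent, count=5):
    """Return up to `count` subdirectory paths of `parent`, newest first.

    `recent_dirs`, `parent` and `count` are hypothetical names used only
    for this sketch.
    """
    entries = []
    with os.scandir(parent) as it:
        for entry in it:
            if entry.is_dir():
                # entry.path already includes `parent`, unlike the bare
                # names returned by os.listdir()
                entries.append((entry.stat().st_mtime, entry.path))
    # Sort by modification time, most recent first
    entries.sort(reverse=True)
    return [path for _mtime, path in entries[:count]]
```

Calling recent_dirs(projloc) would then give the five most recently modified folders directly inside the scanned directory, judged by the folders' own timestamps rather than by the files inside them.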
Edit
Once I get the most recently modified file, I want to exclude its directory from being scanned, or scan only the other directories. For example, suppose Folder 1 was most recently modified and Python displayed Folder 1. I would then want to exclude that directory while scanning for the second most recently modified directory.
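One way to express this exclusion with os.walk() is to prune the `dirs` list in place, which the os.walk() documentation allows when walking top-down (the default). The sketch below is only an illustration: `newest_file`, `recent_files`, and their parameters are hypothetical names, and it assumes that excluding a directory should also exclude everything beneath it and that the files live in subfolders rather than directly in the scanned folder.

```python
import os

def newest_file(top, excluded=()):
    """Return (directory, filename) of the newest file under `top`,
    skipping each directory in `excluded` and everything below it.
    Returns None when no file is found."""
    excluded = {os.path.abspath(p) for p in excluded}
    best = None  # (mtime, directory, filename)
    for root, dirs, files in os.walk(top):
        if os.path.abspath(root) in excluded:
            dirs[:] = []  # the starting directory itself is excluded
            continue
        # Pruning `dirs` in place stops os.walk() from descending into them
        dirs[:] = [d for d in dirs
                   if os.path.abspath(os.path.join(root, d)) not in excluded]
        for fname in files:
            mtime = os.stat(os.path.join(root, fname)).st_mtime
            if best is None or mtime > best[0]:
                best = (mtime, root, fname)
    return None if best is None else (best[1], best[2])

def recent_files(top, count=5):
    """Collect up to `count` files, excluding each winner's directory
    before searching for the next one."""
    excluded, found = [], []
    while len(found) < count:
        hit = newest_file(top, excluded)
        if hit is None:  # fewer than `count` directories contain files
            break
        directory, fname = hit
        found.append(os.path.join(directory, fname))
        excluded.append(directory)
    return found
```

Compared with testing `root not in ...` inside the loop, pruning avoids walking into excluded folders at all, and the `None` check keeps the search from looping forever when fewer than `count` folders contain files.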
After reading @tripleee's comment, I wrote this piece of code, which gets the most recently modified files.
import os

os.chdir('Folder')
projloc = os.getcwd()  # getting the folder to scan
list_of_dirs_to_exclude = []

def get_recent_files():
    max_mtime = 0
    max_dir = max_file = None
    for root, dirs, files in os.walk(projloc):
        if root not in list_of_dirs_to_exclude:  # I added the `not`, unlike @tripleee's answer
            for fname in files:
                full_path = os.path.join(root, fname)
                mtime = os.stat(full_path).st_mtime
                if mtime > max_mtime:
                    max_mtime = mtime
                    max_dir = root
                    max_file = fname
    if max_dir is None:  # no files left in the remaining directories
        return
    list_of_dirs_to_exclude.insert(0, max_dir)
    print(max_file)
    if len(list_of_dirs_to_exclude) < 5:  # you can use any number you want, such as 4, 6, 7, etc.
        get_recent_files()

get_recent_files()
Here is the updated code if you want everything in the same def:
def get_recent_files():
    list_of_dirs_to_exclude = []
    # projloc is predefined for me; I got it using the same method as in the code above
    while len(list_of_dirs_to_exclude) < 5:
        max_mtime = 0  # reset before each scan
        max_dir = max_file = None
        for root, dirs, files in os.walk(projloc):
            if root not in list_of_dirs_to_exclude:
                for fname in files:
                    full_path = os.path.join(root, fname)
                    mtime = os.stat(full_path).st_mtime
                    if mtime > max_mtime:
                        max_mtime = mtime
                        max_dir = root
                        max_file = fname
        if max_dir is None:  # no files left in the remaining directories
            break
        list_of_dirs_to_exclude.insert(0, max_dir)
        print(max_file)
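The code above walks the whole tree once per result. A possible alternative, sketched here under the assumption that the goal is "the newest file of each directory, most recent directories first", is to walk the tree a single time, remember the newest file per directory, and then pick the top entries with heapq.nlargest(). The name `recent_files_one_pass` and its parameters are hypothetical and not taken from either linked post.

```python
import heapq
import os

def recent_files_one_pass(top, count=5):
    """Return up to `count` paths: the newest file of each directory under
    `top`, ordered from most to least recently modified."""
    newest = {}  # directory -> (mtime, full path of its newest file)
    for root, _dirs, files in os.walk(top):
        for fname in files:
            full_path = os.path.join(root, fname)
            mtime = os.stat(full_path).st_mtime
            if root not in newest or mtime > newest[root][0]:
                newest[root] = (mtime, full_path)
    # Keep only the `count` directories whose newest file is most recent
    top_hits = heapq.nlargest(count, newest.values())
    return [path for _mtime, path in top_hits]
```

Because each directory contributes at most one entry, no exclusion list is needed, and the function simply returns fewer paths when fewer than `count` directories contain files.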