
Loop through sub directories to sample files


The following code selects a random sample of files (50 in this case) from directory 1 and copies them to a new folder with the same name.

However, I have hundreds of folders which I need to sample from (and copy to a new folder with the same name).

How can I adjust the first part of the code so that it loops through all sub directories and copies each sample to a new folder with the same name? (So the sample of sub dir 1 goes to dir 1, the sample of sub dir 2 goes to dir 2, etc.)

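import os
import random
import shutil

sourcedir = '/home/mrman/dataset-python/train/1/'
newdir = '/home/mrman/dataset-python/sub-train/1'

# make sure the destination folder exists before copying into it
os.makedirs(newdir, exist_ok=True)

# pick 50 random entries from the source folder and copy them over
filenames = random.sample(os.listdir(sourcedir), 50)
for i in filenames:
    shutil.copy2(os.path.join(sourcedir, i), newdir)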

Solution

  • The solution was simpler than expected (thanks to @idjaw for the tip): walk the parent folder with os.walk and repeat the sampling for each sub directory.

    import os
    import random
    import shutil

    # folder which contains the sub directories
    source_dir = '/home/mrman/dataset-python/train/'
    # folder which receives the sampled copies
    target_dir = '/home/mrman/dataset-python/sub-train/'

    # list sub directories
    for root, dirs, files in os.walk(source_dir):

        # iterate through them
        for i in dirs:
            sub_dir = os.path.join(root, i)

            # create a new folder with the name of the iterated sub dir
            path = os.path.join(target_dir, i)
            os.makedirs(path, exist_ok=True)

            # take random sample, here 3 files per sub dir
            # (fewer if the sub dir holds fewer than 3 entries)
            entries = os.listdir(sub_dir)
            filenames = random.sample(entries, min(3, len(entries)))

            # copy the files to the new destination
            for j in filenames:
                shutil.copy2(os.path.join(sub_dir, j), path)

        # only the top level is needed; stop before os.walk descends
        # into nested folders, whose names would not match source_dir
        break
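
If the same sampling is needed for several datasets, the loop above could be wrapped in a function. The following is only a sketch under a few assumptions: the name `sample_subdirs` and its parameters `k` and `seed` are hypothetical (they do not appear in the original code), only regular files directly inside each sub directory are sampled, and `pathlib` replaces the string concatenation. Passing a `seed` makes the selection repeatable.

```python
import random
import shutil
import tempfile
from pathlib import Path


def sample_subdirs(source_dir, target_dir, k, seed=None):
    """Copy up to k randomly chosen files from each immediate sub directory
    of source_dir into a same-named folder under target_dir.

    Hypothetical helper: returns a dict mapping each sub directory name
    to the list of file names that were copied.
    """
    rng = random.Random(seed)  # private generator, repeatable when seed is set
    copied = {}
    for sub_dir in sorted(Path(source_dir).iterdir()):
        if not sub_dir.is_dir():
            continue  # skip loose files sitting next to the sub directories
        # sample regular files only, so nested folders are never picked
        files = sorted(p for p in sub_dir.iterdir() if p.is_file())
        chosen = rng.sample(files, min(k, len(files)))
        dest = Path(target_dir) / sub_dir.name
        dest.mkdir(parents=True, exist_ok=True)
        for src in chosen:
            shutil.copy2(src, dest)
        copied[sub_dir.name] = [p.name for p in chosen]
    return copied


if __name__ == "__main__":
    # Throwaway demo tree: two sub directories with five small files each.
    with tempfile.TemporaryDirectory() as tmp:
        train = Path(tmp) / "train"
        for name in ("1", "2"):
            (train / name).mkdir(parents=True)
            for n in range(5):
                (train / name / f"img{n}.txt").write_text("x")
        result = sample_subdirs(train, Path(tmp) / "sub-train", k=3, seed=0)
        for name, files in result.items():
            print(name, files)
```

For the folders in the question, the call would look like `sample_subdirs('/home/mrman/dataset-python/train/', '/home/mrman/dataset-python/sub-train/', k=50)`. Sub directories with fewer than `k` files are copied in full instead of raising a `ValueError`.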