I needed a way to pull 10% of the files in a folder, at random, for sampling after every "run." Luckily, my current files are numbered numerically, and sequentially. So my current method is to list file names, parse the numerical portion, pull max and min values, count the number of files and multiply by .1, then use random.sample
to get a "random [10%] sample." I also write these names to a .txt then use shutil.copy
to move the actual files.
Obviously, this does not work if I have an outlier, i.e. if I have a file 345.txt
among other files from 513.txt - 678.txt
. I was wondering if there was a direct way to simply pull a number of files from a folder, randomly? I have looked it up and cannot find a better method.
Thanks.
I was unable to get the other methods to work easily with my code, but I came up with this.
output_folder = 'C:/path/to/folder'
for x in range(int(len(files) *.1)):
to_copy = choice(files)
shutil.copy(os.path.join(subdir, to_copy), output_folder)