I want to randomly choose 10 images from the training dataset to use as test data. If I only copy the selected images to the destination path, it works. But when I also try to remove the source files, only some of them are removed. I tried both os.remove() and shutil.move(), but the issue remains. Below is my script:
for label in labels:
    training_data_path_ch1 = os.path.join(training_data_folder, label, 'ch1')
    test_data_path_ch1 = os.path.join(test_data_folder, label, 'ch1')
    training_data_path_ch5 = os.path.join(training_data_folder, label, 'ch5')
    test_data_path_ch5 = os.path.join(test_data_folder, label, 'ch5')
    ch1_imgs = listdir(training_data_path_ch1)
    # Randomly select 10 images
    ch1_mask = np.random.choice(len(ch1_imgs), 10)
    ch1_selected_imgs = [ch1_imgs[i] for i in ch1_mask]
    for selected_img in ch1_selected_imgs:
        ch1_img_path = os.path.join(training_data_path_ch1, selected_img)
        shutil.copy2(ch1_img_path, test_data_path_ch1)
        os.remove(ch1_img_path)
    print('Successfully move ' + label + ' ch1 images')
I have also added an image to show the running status.
As you can see, the program does copy the images and removes some of them, but why can't it remove all of them?
Any ideas? I appreciate any help!
In:
ch1_mask = np.random.choice(len(ch1_imgs), 10)
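You're potentially getting the same index returned more than once, because np.random.choice samples with replacement by default. That means you're then trying to copy a file you've already copied and deleted (so you can't copy it again, as it's been removed), and fewer than 10 distinct files end up being moved. Instead, pass replace=False, e.g.: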
ch1_mask = np.random.choice(len(ch1_imgs), 10, replace=False)
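To put the fix in context, here is a minimal sketch of the whole move step, assuming the goal is simply to move n distinct files from one folder to another. The helper name move_random_files and its parameters are made up for illustration and are not part of your script; it uses shutil.move in place of the copy2 plus os.remove pair.

```python
import os
import shutil

import numpy as np


def move_random_files(src_dir, dst_dir, n):
    """Move n distinct, randomly chosen files from src_dir to dst_dir.

    Hypothetical helper for illustration; returns the moved file names.
    """
    # Only consider regular files, so subfolders are never selected.
    files = sorted(
        f for f in os.listdir(src_dir)
        if os.path.isfile(os.path.join(src_dir, f))
    )
    if n > len(files):
        raise ValueError("cannot pick %d files from %d" % (n, len(files)))

    # replace=False guarantees every index is drawn at most once.
    chosen = np.random.choice(len(files), n, replace=False)

    os.makedirs(dst_dir, exist_ok=True)
    moved = []
    for i in chosen:
        name = files[i]
        # Giving the full destination path keeps the original file name.
        shutil.move(os.path.join(src_dir, name), os.path.join(dst_dir, name))
        moved.append(name)
    return moved
```

In your loop this would be called once per label and channel, for example move_random_files(training_data_path_ch1, test_data_path_ch1, 10). Because no index can repeat, exactly 10 files leave the training folder each time.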