So I have two np arrays, namely Labels and Images. The shape of Labels is (3000, 1), and the shape of Images is (3000, 226, 226, 3). The ith label corresponds to the ith image. Now I want to shuffle my dataset such that the correspondence is preserved. Is there a Python library or function for this task?
One idea I had was to concatenate the labels and images into one array and then shuffle the result along axis 0. However, np.concatenate() did not allow it because the shapes of the two arrays do not match.
You can do it in at least two ways.
You can shuffle an index array and use advanced indexing. This is suboptimal in terms of memory, because advanced indexing returns a copy of each array:
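import numpy as np

# build one shuffled index array and apply it to both arrays
indices = np.arange(len(images))
np.random.shuffle(indices)
images = images[indices]  # advanced indexing returns a shuffled copy
labels = labels[indices]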
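For illustration, the index-based approach could be wrapped in a small helper. This is only a sketch: the function name shuffle_in_unison and the tiny stand-in arrays are hypothetical, and it uses NumPy's newer Generator API (np.random.default_rng) rather than the np.random.shuffle call shown above.

```python
import numpy as np

def shuffle_in_unison(images, labels, seed=None):
    """Return shuffled copies of both arrays, using one shared permutation."""
    if len(images) != len(labels):
        raise ValueError("images and labels must have the same length")
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(images))  # one random ordering of 0..n-1
    return images[perm], labels[perm]

# Small stand-in data (hypothetical): image i is filled with the value i,
# and its label is also i, so the pairing is easy to check afterwards.
n = 8
images = np.arange(n).reshape(n, 1, 1, 1) * np.ones((n, 4, 4, 3), dtype=int)
labels = np.arange(n).reshape(n, 1)

shuffled_images, shuffled_labels = shuffle_in_unison(images, labels, seed=0)

# Every image should still sit next to its own label.
print(np.array_equal(shuffled_images[:, 0, 0, 0], shuffled_labels[:, 0]))
```

Because the same perm indexes both arrays, row i of one result always matches row i of the other, whatever their trailing shapes are.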
But I think the best option is to reset the seed before each shuffle, so that both arrays are shuffled in place with the same permutation:
# make sure images and labels are shuffled in place and in unison
s = 42 # or whatever
np.random.seed(s)
np.random.shuffle(images)
np.random.seed(s)
np.random.shuffle(labels)
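A quick way to convince yourself that the pairing survives is to try the same trick on small arrays where each label can be read back from its image. The stand-in data and the still_paired name below are hypothetical, and the check assumes both arrays have the same length along axis 0, since the trick relies on both shuffles drawing the same sequence of random numbers.

```python
import numpy as np

# Small stand-in data (hypothetical): image i is filled with the value i,
# and its label is also i.
n = 10
images = np.arange(n).reshape(n, 1, 1, 1) * np.ones((n, 2, 2, 3), dtype=int)
labels = np.arange(n).reshape(n, 1)

s = 42  # or whatever
np.random.seed(s)
np.random.shuffle(images)  # shuffles along axis 0, in place
np.random.seed(s)
np.random.shuffle(labels)  # same seed, same length, so the same permutation

# Check that every image still sits next to its own label.
still_paired = np.array_equal(images[:, 0, 0, 0], labels[:, 0])
print(still_paired)
```

Note that np.random.seed resets the global random state, so any other code that draws from np.random afterwards will also be affected by the chosen seed.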