How to shuffle two NumPy arrays so that the correspondence between them is preserved

I have two NumPy arrays, Labels and Images.

The shape of Labels is (3000, 1), and the shape of Images is (3000, 226, 226, 3). The i-th label corresponds to the i-th image. I want to shuffle my dataset so that this correspondence is preserved. Is there a Python library or function for this task?

One idea I had was to concatenate the labels and images into one array and then shuffle the result along axis 0. However, np.concatenate() did not allow it because the arrays have different shapes.

Solution

You can do it in at least two ways.

The first is to shuffle an array of indices and use advanced indexing. This is suboptimal for memory, because indexing with an index array creates a copy of each array:

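import numpy as np

# one random permutation of the row indices, applied to both arrays
indices = np.arange(len(images))
np.random.shuffle(indices)
images = images[indices]  # advanced indexing returns a shuffled copy
labels = labels[indices]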
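
For reference, here is a minimal sketch of the index-based approach wrapped in a function, with a small check that the pairing survives. The helper name `shuffle_together` and the tiny stand-in arrays are made up for illustration, and it uses `np.random.default_rng` rather than the global `np.random` functions above, which is a substitution on my part.

```python
import numpy as np

def shuffle_together(images, labels, seed=None):
    """Return shuffled copies of both arrays using one shared permutation."""
    if len(images) != len(labels):
        raise ValueError("arrays must have the same length along axis 0")
    rng = np.random.default_rng(seed)    # seed is optional, for reproducibility
    perm = rng.permutation(len(images))  # one random ordering of the row indices
    return images[perm], labels[perm]

# Small stand-in data (hypothetical): image i is filled with the value i, label i is i.
n = 10
labels = np.arange(n).reshape(n, 1)
images = np.ones((n, 4, 4, 3), dtype=np.int64) * np.arange(n).reshape(n, 1, 1, 1)

shuffled_images, shuffled_labels = shuffle_together(images, labels, seed=0)

# Each shuffled image should still carry the value of its label.
print(np.array_equal(shuffled_images[:, 0, 0, 0], shuffled_labels[:, 0]))
```

The original arrays are left untouched here, so you need enough memory to hold both the original and the shuffled copies at the same time.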

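But I think the best option is to reset the seed before each shuffle, which allows shuffling in place. With the same seed and the same length along the first axis, both calls apply the same permutation: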

# make sure images and labels are shuffled in place and in unison
s = 42  # any integer seed works
np.random.seed(s)
np.random.shuffle(images)
np.random.seed(s)
np.random.shuffle(labels)
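
Note that re-seeding fixes the shuffle order for every run and resets the global generator for any later code. If that is a concern, a possible variation is to save and restore the generator state instead of seeding. The sketch below assumes both arrays have the same length along axis 0; the helper name `shuffle_in_place_together` and the stand-in arrays are hypothetical.

```python
import numpy as np

def shuffle_in_place_together(a, b):
    """Shuffle a and b in place along axis 0 with the same permutation."""
    if len(a) != len(b):
        raise ValueError("arrays must have the same length along axis 0")
    state = np.random.get_state()  # remember the global RNG state
    np.random.shuffle(a)
    np.random.set_state(state)     # rewind so the second shuffle repeats the same draws
    np.random.shuffle(b)

# Small stand-in data (hypothetical): image i is filled with the value i, label i is i.
n = 10
labels = np.arange(n).reshape(n, 1)
images = np.ones((n, 4, 4, 3), dtype=np.int64) * np.arange(n).reshape(n, 1, 1, 1)

shuffle_in_place_together(images, labels)

# Each image should still carry the value of its label after the in-place shuffle.
print(np.array_equal(images[:, 0, 0, 0], labels[:, 0]))
```

This keeps the in-place behaviour of the seed trick without hard-coding a seed, so each run gives a different order unless you seed once at the start of the program.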