Can't compute cosine similarity between 2 images: how do I make both vectors the same length?

I'm working on face verification (I'm a newbie). First I vectorize the 2 pictures I want to compare:

import matplotlib.pyplot as plt

filename1 = '/1_52381.jpg'
filename2 = '/1_443339.jpg'

img1 = plt.imread(filename1)
rows,cols,colors = img1.shape
img1_size = rows*cols*colors
img1_1D_vector = img1.reshape(img1_size).reshape(-1, 1)


img2 = plt.imread(filename2)
rows,cols,colors = img2.shape
img2_size = rows*cols*colors
img2_1D_vector = img2.reshape(img2_size).reshape(-1, 1)

(img2_1D_vector.shape,img1_1D_vector.shape)

Here I get the dimensions of the two vectors, which are: ((30960, 1), (55932, 1)).

My question is: how do I make them both the same length? Do I need to resize the pictures to the same size first, or can I do it after vectorizing them? Thanks for reading.

Solution

Yes, to compute a cosine similarity you need your vectors to have the same dimension, and resizing one of the pictures before reshaping it into a vector is a good solution.

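To resize, you can use one of the image processing libraries available in Python.

Note: there are different algorithms and parameters that can be used for resizing (for example the interpolation order), and they will give slightly different pixel values.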

# with skimage (you can also use PIL or cv2)
from skimage.transform import resize

# resize img2 so that it has the same shape as img1
shape = img1.shape
img2_resized = resize(img2, shape)

# flatten both images into 1D vectors
img1_vector = img1.ravel()
img2_vector = img2_resized.ravel()

# now you can perform your cosine similarity between img1_vector and img2_vector
# which are of the same dimension
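
Once both vectors have the same length, the cosine similarity itself is only a few lines. Here is a minimal sketch using plain NumPy; the helper name `cosine_similarity` and the synthetic arrays are only for illustration and are not part of the question's code (scikit-learn and SciPy also provide equivalents):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two arrays, flattened to 1D (illustrative helper)."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError(f"length mismatch: {a.shape} vs {b.shape}")
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        # convention chosen here: similarity with an all-zero vector is 0
        return 0.0
    return float(np.dot(a, b) / denom)

# synthetic stand-ins for two already-resized images of identical shape
rng = np.random.default_rng(0)
img_a = rng.random((8, 8, 3))
img_b = rng.random((8, 8, 3))

print(cosine_similarity(img_a, img_b))
```

Cosine similarity does not depend on the overall scale of each vector, so it should not matter that `resize` returns floats in [0, 1] while the original image may be stored as 0-255 integers.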

You may want to downscale the bigger picture instead of upscaling the smaller one, as upscaling may introduce more artefacts. You may also want to work with a fixed size across a whole dataset.
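
A rough sketch of the fixed-size idea, assuming skimage is installed. The target shape `(64, 64, 3)` and the helper `to_fixed_size_vector` are arbitrary choices for illustration, and the synthetic arrays only stand in for the real pictures (their shapes are picked so the flattened lengths match the ones in the question, the actual dimensions are not known):

```python
import numpy as np
from skimage.transform import resize

# hypothetical common size for the whole dataset; smaller than both inputs,
# so every image is downscaled rather than upscaled
TARGET_SHAPE = (64, 64, 3)

def to_fixed_size_vector(img, target_shape=TARGET_SHAPE):
    """Resize an image to target_shape and flatten it (illustrative helper)."""
    # anti_aliasing smooths the image before downscaling to limit artefacts
    resized = resize(img, target_shape, anti_aliasing=True)
    return resized.ravel()

# synthetic stand-ins for img1 and img2
rng = np.random.default_rng(0)
img_small = rng.random((120, 86, 3))   # 120 * 86 * 3 = 30960 values
img_big = rng.random((158, 118, 3))    # 158 * 118 * 3 = 55932 values

vec_small = to_fixed_size_vector(img_small)
vec_big = to_fixed_size_vector(img_big)

print(vec_small.shape, vec_big.shape)
```

With every image mapped to the same shape up front, any pair from the dataset can be compared directly without resizing again per pair.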