Tags: python, opencv, image-processing, alignment, registration

Image registration issue using OpenCV & Python


I'm trying to register two images that are only slightly misaligned, and then check the difference between them with the code below. But I'm getting a strange aligned image, even though the keypoint matches detected between the two images seem to be very good.

The images are:

Image 1

Image 2

The TIFF format makes the images too big to upload, so I uploaded screenshots of them instead.

  • The white circle in the center is an element that I need to hide for certain reasons, but the registration can still be done using only the '15' pattern in the upper-right area (one way to restrict detection to that region is sketched below).
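
One way to restrict the registration to that pattern would be to pass a mask to detectAndCompute (a rough sketch; it assumes image1 and image2 are the grayscale images from my main code below, and the ROI coordinates are placeholders to adjust):

import cv2
import numpy as np

# Hypothetical mask limiting keypoint detection to the '15' region;
# the coordinates below are placeholders and must be tuned.
height, width = image1.shape[:2]
mask = np.zeros((height, width), dtype=np.uint8)
mask[int(height / 5.5):int(height / 2.5), int(2.5 * width / 4):width] = 255

sift = cv2.SIFT_create()
keypoints1, descriptors1 = sift.detectAndCompute(image1, mask)
keypoints2, descriptors2 = sift.detectAndCompute(image2, mask)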

I also normalize image 2 to image 1's brightness with this code as a pre-processing step:

import cv2
import numpy as np

# Convert images to grayscale
img1 = cv2.cvtColor(self.img1, cv2.COLOR_BGR2GRAY)
img2 = cv2.cvtColor(self.img2, cv2.COLOR_BGR2GRAY)

# Crop the images to an ROI - the same region in both images: the top-left corner, 1/5 of the image in each dimension
img1_roi = img1[0:int(img1.shape[0] / 5), 0:int(img1.shape[1] / 5)]
img2_roi = img2[0:int(img2.shape[0] / 5), 0:int(img2.shape[1] / 5)]

# Calculate the mean of the images.
mean_img1 = np.mean(img1_roi)
mean_img2 = np.mean(img2_roi)

# Calculate the ratio of the brightness of the images.
ratio = mean_img1 / mean_img2
print(f'Brightness ratio: {ratio}')

# Multiply the second image by the ratio.
self.img2 = self.img2 * ratio

# Clip to the valid range and convert back to uint8.
self.img2 = np.clip(self.img2, 0, 255)
self.img2 = self.img2.astype(np.uint8)
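
For what it's worth, the scale/clip/convert sequence could probably be collapsed into a single OpenCV call (a sketch; cv2.convertScaleAbs scales, takes the absolute value, and saturates to uint8, which matches the clip-and-cast above for non-negative input):

self.img2 = cv2.convertScaleAbs(self.img2, alpha=ratio)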

And here is my main code:

image1 = cv2.cvtColor(self.img1, cv2.COLOR_BGR2GRAY)
image2 = cv2.cvtColor(self.img2, cv2.COLOR_BGR2GRAY)
height, width = image2.shape

sift = cv2.SIFT_create()  # cv2.xfeatures2d.SIFT_create() on older opencv-contrib builds

keypoints1, descriptors1 = sift.detectAndCompute(image1, None)
keypoints2, descriptors2 = sift.detectAndCompute(image2, None)

bf = cv2.BFMatcher()

matches = bf.knnMatch(descriptors1, descriptors2, k=2)

good_matches = []
for m, n in matches:
    if m.distance < 0.75 * n.distance:
        good_matches.append(m)
src_pts = np.float32([keypoints1[m.queryIdx].pt for m in good_matches]).reshape(-1, 1, 2)
dst_pts = np.float32([keypoints2[m.trainIdx].pt for m in good_matches]).reshape(-1, 1, 2)
M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)

draw_params = dict(singlePointColor=None, flags=2)
img3 = cv2.drawMatches(image1, keypoints1, image2, keypoints2, good_matches, None, **draw_params)

Matches output:

Matches output

aligned_image = cv2.warpAffine(image2, M[0:2, :], (image1.shape[1], image1.shape[0]))

Aligned image:

Aligned image

difference = cv2.absdiff(image1, aligned_image)
threshold = 10  # Adjust this threshold as per your requirements
change_mask = cv2.threshold(difference, threshold, 255, cv2.THRESH_BINARY)[1]

Difference image:

Difference image

I also tried a Fourier method on the ROI of the '15' pattern:

from scipy import signal

def _correlate_images(self, image1, image2):
    # compute the cross-correlation map between two images
    correlation = signal.correlate2d(image1, image2, boundary='symm', mode='same')
    return correlation

def _compute_shift_distance(self, image1, image2):
    # compute the shift distance between two images
    correlation = self._correlate_images(image1, image2)
    (y, x) = np.unravel_index(correlation.argmax(), correlation.shape)
    (tH, tW) = image2.shape[:2]
    shift_distance = (x - tW // 2, y - tH // 2)
    return shift_distance

def _align_images_fourier_mellin(self, image1, image2):
    # align images using Fourier-Mellin transform

    # compute the Fourier Transform of both images, then compute the
    # magnitude spectrum
    fft1 = np.fft.fft2(image1)
    fft2 = np.fft.fft2(image2)
    magnitude_spectrum1 = 20 * np.log(np.abs(fft1))
    magnitude_spectrum2 = 20 * np.log(np.abs(fft2))

    # find the peak in the correlation map
    correlation = self._correlate_images(magnitude_spectrum1, magnitude_spectrum2)
    (x, y) = np.unravel_index(correlation.argmax(), correlation.shape)

    # compute the shift distance
    (delta_x, delta_y) = self._compute_shift_distance(image1, image2)

    # use the shift distance to translate the image
    M = np.float32([[1, 0, delta_x], [0, 1, delta_y]])
    shifted = cv2.warpAffine(image2, M, (image2.shape[1], image2.shape[0]))

    # return the aligned image
    return shifted


image1_roi = image1[int(height / 5.5):int(height / 2.5), int(2.5 * width / 4):width]
image2_roi = image2[int(height / 5.5):int(height / 2.5), int(2.5 * width / 4):width]
aligned_image = self._align_images_fourier_mellin(image1_roi, image2_roi)

And after a few minutes of running I get these results:

aligned image by Fourier

aligned full image by Fourier

Fourier diffs
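
As an aside, I understand the slow signal.correlate2d peak search could be replaced by FFT-based phase correlation, which finds the same translation peak in a fraction of the time (a sketch; the sign convention of the returned shift may need flipping):

import cv2
import numpy as np

# phaseCorrelate expects single-channel float32/float64 input
(dx, dy), response = cv2.phaseCorrelate(np.float32(image1_roi), np.float32(image2_roi))
M = np.float32([[1, 0, dx], [0, 1, dy]])
shifted = cv2.warpAffine(image2_roi, M, (image2_roi.shape[1], image2_roi.shape[0]))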

Any ideas?

Thanks a lot!


Solution

  • I think a good method for alignment of this type is the Fourier-Mellin transform (see for example this tutorial). Note that the code added by the OP in response to my comment is not the Fourier-Mellin algorithm; it attempts to compute a translation through correlation of the magnitudes of the FFTs of the two images, which makes no sense.
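
    To sketch what Fourier-Mellin actually does (a rough, hypothetical OpenCV illustration, not DIPlib's implementation): the magnitude of the Fourier transform is invariant to translation, and a log-polar transform of it turns rotation and scaling into shifts, which plain phase correlation can then recover:

    import cv2
    import numpy as np

    def estimate_rotation_scale(img1, img2):
        # Hypothetical helper. Magnitude spectra are translation-invariant,
        # so they isolate the rotation/scaling component of the motion.
        f1 = np.fft.fftshift(np.abs(np.fft.fft2(np.float32(img1))))
        f2 = np.fft.fftshift(np.abs(np.fft.fft2(np.float32(img2))))
        h, w = img1.shape
        center = (w / 2, h / 2)
        radius = min(w, h) / 2
        # Log-polar warp: rotation becomes a vertical shift,
        # scaling becomes a horizontal shift.
        flags = cv2.INTER_LINEAR | cv2.WARP_POLAR_LOG
        lp1 = cv2.warpPolar(f1, (w, h), center, radius, flags)
        lp2 = cv2.warpPolar(f2, (w, h), center, radius, flags)
        (dx, dy), _ = cv2.phaseCorrelate(lp1, lp2)
        angle = dy * 360.0 / h                   # rows cover the full 360 degrees
        scale = np.exp(dx * np.log(radius) / w)  # columns cover log(radius)
        return angle, scale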

    The DIPlib library has an implementation of the Fourier-Mellin algorithm (disclaimer: I'm an author of DIPlib, and I wrote this implementation):

    import diplib as dip
    import numpy as np
    
    # Load the two images
    img1 = dip.ImageRead("img1.jpg")
    img2 = dip.ImageRead("img2.jpg")
    
    # They're gray-scale images, even if the JPEG file has RGB values
    img1 = img1(1)  # just keep the green channel
    img2 = img2(1)
    
    # They need to be the same size
    out_size = np.minimum(img1.Sizes(), img2.Sizes())
    img1.Crop(out_size)
    img2.Crop(out_size)
    
    # Note that these JPEGs are not very big. If the original images are huge, you can
    # down-sample them first to improve the efficiency of the operations.
    
    # Apply Fourier-Mellin to transform one image to match the other
    out = dip.Image()
    matrix = dip.FourierMellinMatch2D(img1, img2, out=out, correlationMethod="don't normalize")
    
    dip.JoinChannels((img1, out)).Show()
    

    The matrix returned is a list of values useful in other functions of this library, but it is equivalent to this transformation matrix:

    / 0.9999 -0.0085 -12.576 \
    | 0.0085  0.9999   5.336 |
    \ 0.0     0.0      1.0   /
    

    That is, it finds only a tiny rotation and 1:1 scaling; the translation is the only relevant element.
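
    If you want to apply an equivalent transform with OpenCV, a minimal sketch (assuming img1 and img2 were loaded as NumPy arrays, e.g. with cv2.imread, and that the matrix maps img2 into img1's frame; depending on the convention you may need to invert it or pass cv2.WARP_INVERSE_MAP):

    import cv2
    import numpy as np

    # The matrix reported above, written as a homogeneous transform.
    M = np.array([[0.9999, -0.0085, -12.576],
                  [0.0085,  0.9999,   5.336],
                  [0.0,     0.0,      1.0]])

    # The perspective row is identity, so an affine warp of the top two rows suffices.
    aligned = cv2.warpAffine(img2, M[:2, :], (img1.shape[1], img1.shape[0]))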

    The aligned image overlaid on the input images looks like this:

    display of original img1 in red and out in green

    It's not a perfect alignment: the numbers match up perfectly, but the ring on the far side of the numbers doesn't. This indicates that a more complex transform is required to match up the two images. Likely the camera angle changed, or the object itself deformed.
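
    One candidate for such a transform is a full perspective (homography) fit. Here is a sketch reusing the SIFT matches from the question; note the point order is reversed relative to the question's findHomography call, so the homography maps image2 into image1's frame, and warpPerspective keeps the perspective row that warpAffine would discard:

    import cv2

    # Homography from image2's points to image1's points, so that
    # warping image2 brings it into image1's frame.
    H, inliers = cv2.findHomography(dst_pts, src_pts, cv2.RANSAC, 5.0)
    aligned = cv2.warpPerspective(image2, H, (image1.shape[1], image1.shape[0]))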