Optimizing an algorithm for image translation with maximum common foreground pixels

I'm looking for an algorithm that efficiently determines the optimal translation of one 2D image onto another (along the x and y axes only, no rotation). The goal is to maximize the number of common foreground pixels, where the foreground is black on a white background or vice versa. The foreground-to-background ratio is heavily skewed towards the background.
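To make the objective concrete: with both images as 0/1 NumPy arrays, the score of a candidate shift (dx, dy) is the number of foreground pixels of the first image that land on foreground pixels of the second. A minimal sketch (array names and the toy example are illustrative):

```python
import numpy as np

def overlap_score(img1, img2, dx, dy):
    """Count foreground pixels of img1 that land on foreground
    pixels of img2 after translating img1 by (dx, dy)."""
    h, w = img2.shape
    xs, ys = np.nonzero(img1)      # foreground coordinates of image 1
    xs, ys = xs + dx, ys + dy      # apply the candidate translation
    inside = (xs >= 0) & (xs < h) & (ys >= 0) & (ys < w)
    return int(img2[xs[inside], ys[inside]].sum())

# Toy example: a 2x2 block that has moved by (1, 1)
a = np.zeros((5, 5), int); a[1:3, 1:3] = 1
b = np.zeros((5, 5), int); b[2:4, 2:4] = 1
print(overlap_score(a, b, 1, 1))  # all four pixels coincide -> 4
```

The brute-force search below evaluates exactly this score for many candidate (dx, dy) pairs, which is where the cost comes from.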


[first input image]

[second input image]

Here, the best translation is [-171, 97].

I've already implemented an algorithm that compares all pixels, iterating over translations by incrementing x and y. However, this approach is time-consuming. To speed it up, I restricted the search to translations that map a white pixel of the first image onto a white pixel of the second image. That helps, but it's still slow.

Below is a code snippet of my current algorithm (arrayImage contains only 0s and 1s; it's a binary image).

public static int[] findBestTranslation(int[][] arrayImage1, int[][] arrayImage2) {
    int[] bestTranslation = new int[2];
    int bestSimilarity = Integer.MIN_VALUE;

    int img1Length = arrayImage1.length;
    int img1Width = arrayImage1[0].length;
    int img2Length = arrayImage2.length;
    int img2Width = arrayImage2[0].length;
    List<int[]> whitePixelsImage1 = findWhitePixels(arrayImage1);
    List<int[]> whitePixelsImage2 = findWhitePixels(arrayImage2);
    // Offsets range from -(image-1 size - 1) to (image-2 size - 1), so size the
    // lookup table accordingly and shift indices by the image-1 dimensions.
    boolean[][] checkedOffsets = new boolean[img1Length + img2Length][img1Width + img2Width];

    for (int[] pixel1 : whitePixelsImage1) {
        for (int[] pixel2 : whitePixelsImage2) {
            // Candidate translation mapping pixel1 onto pixel2
            int xOffset = pixel2[0] - pixel1[0];
            int yOffset = pixel2[1] - pixel1[1];

            // Skip offsets that have already been evaluated
            if (checkedOffsets[xOffset + img1Length][yOffset + img1Width]) {
                continue;
            }
            checkedOffsets[xOffset + img1Length][yOffset + img1Width] = true;

            // Count white pixels of image 1 landing on white pixels of image 2
            int similarity = 0;
            for (int[] pixelNotTranslated : whitePixelsImage1) {
                int xTranslated = pixelNotTranslated[0] + xOffset;
                int yTranslated = pixelNotTranslated[1] + yOffset;
                if (xTranslated >= 0 && xTranslated < img2Length
                        && yTranslated >= 0 && yTranslated < img2Width) {
                    similarity += arrayImage2[xTranslated][yTranslated];
                }
            }

            if (similarity > bestSimilarity) {
                bestSimilarity = similarity;
                bestTranslation[0] = xOffset;
                bestTranslation[1] = yOffset;
            }
        }
    }
    return bestTranslation;
}


  • I don't speak Java, but I think you want a "phase correlation", which you can do with scikit-image in Python as follows:

    #!/usr/bin/env python3
    import numpy as np
    from skimage import io
    from skimage.registration import phase_cross_correlation
    # Define image names and load them
    im1name = 'MVlJh.png'
    im2name = 'XSj05.png'
    im1 = io.imread(im1name)
    im2 = io.imread(im2name)
    # Perform phase correlation
    shift, error, diffphase = phase_cross_correlation(im1, im2)
    print(f'Detected pixel offset (y, x): {shift}')


    For your two images, it outputs the following in around 1 second:

    Detected pixel offset (y, x): [-97. 171.   0.]

    I guess you could either "shell out" to Python with some type of exec() or system() function, or find a Java implementation of the phase correlation technique.
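Another Java-friendly route: for binary images, the table of overlap counts over all shifts is exactly the cross-correlation of the two images, which can be computed for every shift at once with FFTs; any FFT library on the Java side (JTransforms, for example) could port this. A NumPy sketch of the idea (plain FFT cross-correlation, not phase correlation; names are illustrative):

```python
import numpy as np

def best_shift_by_correlation(img1, img2):
    """Return (dx, dy) maximizing the count of coinciding foreground pixels,
    computed for all shifts at once via FFT-based cross-correlation."""
    h = img1.shape[0] + img2.shape[0] - 1
    w = img1.shape[1] + img2.shape[1] - 1
    f1 = np.fft.rfft2(img1, (h, w))          # zero-pad so circular == linear
    f2 = np.fft.rfft2(img2, (h, w))
    corr = np.fft.irfft2(np.conj(f1) * f2, (h, w))  # corr[dx, dy] = overlap at that shift
    dx, dy = np.unravel_index(np.argmax(corr), corr.shape)
    # Indices at or beyond the image-2 size are wrapped negative shifts
    if dx >= img2.shape[0]:
        dx -= h
    if dy >= img2.shape[1]:
        dy -= w
    return int(dx), int(dy)

# Toy example: a 2x2 block that has moved by (1, 1)
a = np.zeros((5, 5), int); a[1:3, 1:3] = 1
b = np.zeros((5, 5), int); b[2:4, 2:4] = 1
print(best_shift_by_correlation(a, b))  # -> (1, 1)
```

This replaces the O(whitePixels1 × whitePixels2 × whitePixels1) search with a few FFTs of the padded image size, and the coordinate convention matches the question's (offset = position in image 2 minus position in image 1).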

    There is also an OpenCV implementation available, which may have Java bindings. It is noticeably faster than the method above and gets the result in well under 1 second:

    #!/usr/bin/env python3
    import numpy as np
    import cv2 as cv
    im1name = 'MVlJh.png'
    im2name = 'XSj05.png'
    im1 = cv.imread(im1name, cv.IMREAD_GRAYSCALE)
    im2 = cv.imread(im2name, cv.IMREAD_GRAYSCALE)
    # Perform phase correlation
    retval, response = cv.phaseCorrelate(im1.astype(float), im2.astype(float))
    print(f'{retval=}')


    retval=(-171.0011860055988, 97.00465314052326)

    Here's what I get if I plot one image in green, then the second image in red shifted by the calculated amount.

    enter image description here

    I used a "hardmix" blending mode, so the yellow parts are where red and green coincide. For my own reference, this was my blend command with ImageMagick:

    magick MVlJh.png -fill lime -opaque white \( XSj05.png -geometry +171-97  -fill red -opaque white  \) -compose hardmix -composite result.png