python numpy opencv image-processing image-manipulation

cv2.warpPerspective just a little bit off?


I'm rather new to OpenCV and image processing in general. There are a lot of tutorials on warping an image using points to make it "top down". For my project, I have an image of some computer monitors, and I'm trying to make it appear that an image I select is actually displaying on one of the monitors. It almost works, but the transform seems to be a little bit off.

I double-checked the points to make sure they're correct, so I think it has to be something to do with my matrix. Here is my Python code:

import cv2
import numpy as np


def normalizeImage(image, monitorID, monitorPts, monitors):
    # This function will resize any image to the correct 
    # height and width to prepare it for being reshaped
    rows, cols, _ = image.shape
    pts1 = np.array([[0,0],[0,cols],[rows,cols],[rows,0]], dtype=np.float32)
    pts2 = monitorPts[monitorID]
    matrix = cv2.getPerspectiveTransform(pts1, pts2) # I think my issue is here!!<<<<<<<<<<
    # matrix equals:
    # | .497 .072  27|
    # |-.045 .271 204|
    # |    0    0   1|
    dst = cv2.warpPerspective(image, matrix, (monitors.shape[1],monitors.shape[0]))
    cv2.imshow("dst", dst)
    return dst

    
def combineImages(monitors, image): # add the warped image onto the monitors image
    dst = cv2.add(image, monitors)
    return dst


def main():
    monitors = cv2.imread("images/monitors.jpg")
    image0 = cv2.imread("images/goldglasses.png")
    monitorID = 0 # Which monitor are we adding an image to?
    # monitors.shape should be (694, 1431, 3)
    print("Monitors:")
    print(monitors.shape)
    # Point locations for all three monitors. We're only worried about number 0 for now.
    monitorPts = (np.array([[27,204],[84,411],[418,301],[393,129]], dtype=np.float32),
                 np.array([[436,96],[453,325],[847,319],[850,93]], dtype=np.float32),
                 np.array([[911,93],[878,318],[1259,426],[1327,150]], dtype=np.float32))
    image0 = normalizeImage(image0, monitorID, monitorPts, monitors)
    monitors = combineImages(monitors, image0)
    cv2.imshow("monitors", monitors)
    cv2.waitKey(0)

if __name__ == "__main__":
    # execute only if run as a script
    main()

I think it may have to do with the order in which the matrix transformations are completed, but I'm not sure. At the very least, I'm 99% sure the issue is in the "normalizeImage" function.

Here are the two source images:

goldglasses.png:
[image: goldglasses.png]

monitors.jpg:
[image: monitors.jpg]

In addition, here's the result I am currently getting:

newMonitors.png:
[image: newMonitors.png]

Notice that the guy doesn't quite fit in the monitor. Any help is appreciated. If you have any clarifying questions, feel free to ask. Any alternatives to my method that might be easier are also very appreciated. I have a really hard time understanding a lot of the documentation on some of these functions, so I wouldn't be surprised if there was a better way of doing all this.


Solution

  • It looks like you swapped rows and cols in the coordinates of pts1. OpenCV points are given as (x, y), so:

    • cols corresponds to the x axis (width).
    • rows corresponds to the y axis (height).

    Replace:

    pts1 = np.array([[0,0],[0,cols],[rows,cols],[rows,0]], dtype=np.float32)
    

    With:

    pts1 = np.array([[0,0],[0,rows],[cols,rows],[cols,0]], dtype=np.float32)
    

    To be even more accurate, subtract 1, since the last valid pixel indices are cols-1 and rows-1:

    pts1 = np.array([[0,0],[0,rows-1],[cols-1,rows-1],[cols-1,0]], dtype=np.float32)
    

    Result:
    enter image description here