
OpenCV homography to find global xy coordinates from pixel xy coordinates


I am trying to find the transformation matrix H so that I can multiply it by the (x, y) pixel coordinates and get the (x, y) real-world coordinates. Here is my code:

import cv2
import numpy as np
from numpy.linalg import inv
if __name__ == '__main__':
    # Pixel coordinates of the four reference corners
    D = [159.1, 34.2]
    I = [497.3, 37.5]
    G = [639.3, 479.7]
    A = [0, 478.2]
    # Read source image.
    im_src = cv2.imread('/home/vivek/june_14.png')
    # Four corners of the book in source image
    pts_src = np.array([D, I, G, A])

    # Read destination image.
    im_dst = cv2.imread('/home/vivek/june_14.png')

    # Four corners of the book in destination image.
    print("img1 shape:", im_dst.shape)
    scale = 1
    O = [0.0, 0.0]
    X = [134.0 * scale, 0]
    Y = [0.0, 184.0 * scale]
    P = [134.0 * scale, 184.0 * scale]
    # lx = 75.5 * scale
    # ly = 154.0 * scale
    pts_dst = np.array([O, X, P, Y])

    # Calculate Homography
    h, status = cv2.findHomography(pts_src, pts_dst)

    print("homography:", h)
    print("inv of H:", inv(h))
    print("position of the blob on the ground xy plane:",
          np.dot(np.dot(h, np.array([[323.0], [120.0], [1.0]])), scale))

    # Warp source image to destination based on homography
    im_out = cv2.warpPerspective(im_src, h, (im_dst.shape[1], im_dst.shape[0]))

    # Display images
    cv2.imshow("Source Image", im_src)
    cv2.imshow("Destination Image", im_dst)
    cv2.imshow("Warped Source Image", im_out)
    cv2.imwrite("im_out.jpg", im_out)
    cv2.waitKey(0)

The global xy coordinates I am getting are way off. Am I doing something wrong somewhere?


Solution

    The long answer

    Homographies are 3x3 matrices and points are just pairs, 2x1, so they can't be multiplied together directly. Instead, homogeneous coordinates are used, giving 3x1 vectors to multiply. However, homogeneous points can be scaled while representing the same point; that is, in homogeneous coordinates, (kx, ky, k) is the same point as (x, y, 1) for any non-zero k. From the Wikipedia page on homogeneous coordinates:

    Given a point (x, y) on the Euclidean plane, for any non-zero real number Z, the triple (xZ, yZ, Z) is called a set of homogeneous coordinates for the point. By this definition, multiplying the three homogeneous coordinates by a common, non-zero factor gives a new set of homogeneous coordinates for the same point. In particular, (x, y, 1) is such a system of homogeneous coordinates for the point (x, y). For example, the Cartesian point (1, 2) can be represented in homogeneous coordinates as (1, 2, 1) or (2, 4, 2). The original Cartesian coordinates are recovered by dividing the first two positions by the third. Thus unlike Cartesian coordinates, a single point can be represented by infinitely many homogeneous coordinates.

    Obviously, in Cartesian coordinates, this scaling does not hold; (x, y) is not the same point as (xZ, yZ) unless Z = 1 (or the point is the origin). So we need a way to map these homogeneous coordinates, which can be represented in infinitely many ways, down to Cartesian coordinates, which can only be represented one way. Luckily this is very easy: just scale the homogeneous coordinates so the last number in the triple is 1.
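
As a minimal sketch of that normalization (the helper name `to_cartesian` is made up for illustration and is not an OpenCV or NumPy function), the two homogeneous triples from the Wikipedia example collapse to the same Cartesian point:

```python
import numpy as np

def to_cartesian(p_h):
    """Scale a homogeneous point so its last entry is 1, then drop that entry."""
    p_h = np.asarray(p_h, dtype=float)
    if p_h[-1] == 0:
        # A zero last coordinate is a point at infinity; it has no Cartesian form.
        raise ValueError("last homogeneous coordinate is zero")
    return p_h[:-1] / p_h[-1]

a = to_cartesian([1, 2, 1])
b = to_cartesian([2, 4, 2])
print(a, b)  # both represent the Cartesian point (1, 2)
```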

    Homographies multiply homogeneous coordinates and return homogeneous coordinates. So in order to map them back to the Cartesian world, you just need to divide by the last coordinate to scale them and then take the first two numbers.

    The short answer

    When you multiply homogeneous coordinates by a homography, the result comes back scaled by some factor s:

    sx'       x
    sy' = H * y
    s         1
    

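    So to get back to Cartesian coordinates, divide the new homogeneous coordinates by s: (sx', sy', s)/s = (x', y', 1), and then (x', y') is the point you want. In the code from the question, that means dividing the result of np.dot(h, ...) by its third entry before reading off x and y.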
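
A small NumPy sketch of that step, assuming a made-up homography `H` chosen only so the arithmetic is easy to check by hand (it is not the matrix from the question), and a hypothetical helper `apply_homography`:

```python
import numpy as np

# Hypothetical homography with a non-trivial bottom row, so that s != 1.
H = np.array([[1.0,  0.0, 0.0],
              [0.0,  1.0, 0.0],
              [0.01, 0.0, 1.0]])

def apply_homography(H, x, y):
    """Map a Cartesian point through H and return the Cartesian result."""
    sx, sy, s = H @ np.array([x, y, 1.0])  # homogeneous result (sx', sy', s)
    return sx / s, sy / s                  # divide by s to get (x', y')

raw = H @ np.array([100.0, 50.0, 1.0])     # (100, 50, 2): here s = 2
x_w, y_w = apply_homography(H, 100.0, 50.0)
print(raw)
print(x_w, y_w)                            # (50, 25) after dividing by s
```

Reading the first two entries of `raw` directly would give (100, 50), which is off by exactly the factor s.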

    The shorter answer

    Use the built-in OpenCV function convertPointsFromHomogeneous() to convert your points from homogeneous 3-vectors to Cartesian 2-vectors.