Tags: python, image, opencv, image-processing, opencv3.0

Manually writing code for warpAffine in python


I want to implement an affine transformation without using library functions. I have an image named "transformed" and I want to apply the inverse transformation to obtain the "img_org" image. Right now I am using my own basic GetBilinearPixel function to set the intensity values, but the image is not transforming properly. This is what I came up with:

This is the image ("transformed.png"):

[transformed.png]

This is the image ("img_org.png"):

[img_org.png]

But my goal is to produce this image:

[expected result]

You can see the transformation matrix here:

pts1 = np.float32( [[693,349] , [605,331] , [445,59]] )
pts2 = np.float32 ( [[1379,895] , [1213,970] ,[684,428]] )
Mat = cv2.getAffineTransform(pts2,pts1)
B=Mat

code:

import cv2
import numpy as np

# the input image that should be mapped back to img_org
img = cv2.imread('transformed.png', 0)

img_org = np.zeros(shape=(780,1050))   # accumulated intensities of the output image
img_size = np.zeros(shape=(780,1050))  # number of hits per output pixel, for averaging

def GetBilinearPixel(imArr, posX, posY):
    # basic version for now: just returns the pixel value, no interpolation yet
    return imArr[posX][posY]

for i in range(1, img.shape[0]-1):
    for j in range(1, img.shape[1]-1):
        pos = np.array([[i], [j], [1]], np.float32)
        # print pos
        pos = np.matmul(B,pos)
        r = int(pos[0][0])
        c = int(pos[1][0])
        # print r,c
        if(c <= 1024 and r <= 768 and c >= 0 and r >= 0):
            img_size[r][c] = img_size[r][c]+1
            img_org[r][c] += GetBilinearPixel(img, i, j)
        
for i in range(0,img_org.shape[0]):
    for j in range(0,img_org.shape[1]):
        if(img_size[i][j]>0):
            img_org[i][j] = img_org[i][j]/img_size[i][j]

Is my logic wrong? I know that I have used a very inefficient algorithm. Is there any insight I am missing? Or can you suggest another algorithm that will work?

(Note: I don't want to use the warpAffine function.)


Solution

  • So I vectorized the code, and this method works. I can't find the exact issue with your implementation, but maybe this will shed some light (plus it's much faster).

    The setup to vectorize is to create a linear (homogeneous) array containing every point in the image. We want an array that looks like

    x0 x1 ... xN   x0 x1 ... xN   .....   x0 x1 ... xN
    y0 y0 ... y0   y1 y1 ... y1   .....   yM yM ... yM
     1  1 ...  1    1  1 ...  1   .....    1  1 ...  1
    

    So that every point (xi, yi, 1) is included. Then transforming is just a single matrix multiplication with your transformation matrix and this array.

    To simplify matters (partially because your image naming conventions confused me), I'll say the original starting image is the "destination" or dst because we want to transform back to the "source" or src image. Bearing that in mind, creating this linear homogeneous array could look something like this:

    dst = cv2.imread('img.jpg', 0)
    h, w = dst.shape[:2]
    dst_y, dst_x = np.indices((h, w))  # similar to meshgrid/mgrid
    dst_lin_homg_pts = np.stack((dst_x.ravel(), dst_y.ravel(), np.ones(dst_y.size)))
    

    Then, to transform the points, just create the transformation matrix and multiply. I'll round the transformed pixel locations because I'm using them as an index and not bothering with interpolation:

    src_pts = np.float32([[693, 349], [605, 331], [445, 59]])
    dst_pts = np.float32([[1379, 895], [1213, 970], [684, 428]])
    transf = cv2.getAffineTransform(dst_pts, src_pts)
    src_lin_pts = np.round(transf.dot(dst_lin_homg_pts)).astype(int)
    

    Now this transformation will send some pixels to negative indices, and if we index with those, it'll wrap around the image---probably not what we want to do. Of course in the OpenCV implementation, it just cuts those pixels off completely. But we can just shift all the transformed pixels so that all of the locations are positive and we don't cut off any (you can of course do whatever you want in this regard):

    min_x, min_y = np.amin(src_lin_pts, axis=1)
    src_lin_pts -= np.array([[min_x], [min_y]])
    

    Then we'll need to create the source image src which the transform maps into. I'll create it with a gray background so we can see the extent of the black from the dst image.

    trans_max_x, trans_max_y = np.amax(src_lin_pts, axis=1)
    src = np.ones((trans_max_y+1, trans_max_x+1), dtype=np.uint8)*127
    

    Now all we have to do is place some corresponding pixels from the destination image into the source image. Since I didn't cut off any of the pixels and there's the same number of pixels in both linear points array, I can just assign the transformed pixels the color they had in the original image.

    src[src_lin_pts[1], src_lin_pts[0]] = dst.ravel()
    

    Now, of course, this isn't interpolating on the image. But there are no built-ins in OpenCV for interpolating scattered points onto a grid like this (there are backend C functions that other methods use, but none you can access from Python, AFAIK). Still, you have the important parts: where the destination image gets mapped to, and the original image itself, so you can use any number of libraries to interpolate onto that grid, or just implement a linear interpolation yourself, as it's not too difficult. You'll probably want to skip the rounding of the warped pixel locations before doing so, of course.
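
    If you do want to roll your own, here is a minimal sketch of one way to do it. Note that it is not the exact method above: it goes through the inverse mapping (invert the affine, then sample dst at the fractional location each src pixel maps back to), which is also what warpAffine does internally. It reuses transf, dst, h, w, min_x, min_y, and src from the earlier snippets; the names src_homg, src_interp, etc. are just mine.

    # Rough sketch: inverse mapping with manual bilinear interpolation.
    inv = np.linalg.inv(np.vstack((transf, [0, 0, 1])))[:2]   # 2x3 inverse affine

    src_h, src_w = src.shape[:2]
    sy, sx = np.indices((src_h, src_w))
    # undo the min_x/min_y shift, then map src coordinates back into dst coordinates
    src_homg = np.stack((sx.ravel() + min_x, sy.ravel() + min_y, np.ones(sx.size)))
    x, y = inv.dot(src_homg)                                  # fractional dst coords

    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    valid = (x0 >= 0) & (x0 + 1 < w) & (y0 >= 0) & (y0 + 1 < h)
    wx, wy = (x - x0)[valid], (y - y0)[valid]
    x0, y0 = x0[valid], y0[valid]

    d = dst.astype(float)
    out = np.full(x.shape, 127.0)                             # gray background again
    out[valid] = ((1 - wx) * (1 - wy) * d[y0, x0] + wx * (1 - wy) * d[y0, x0 + 1]
                  + (1 - wx) * wy * d[y1 := y0 + 1, x0] + wx * wy * d[y1, x0 + 1])
    src_interp = out.reshape(src_h, src_w).astype(np.uint8)

    Either way, the rounded nearest-neighbour version from above is usually good enough to check that the geometry is right: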

    cv2.imshow('src', src)
    cv2.waitKey()
    

    Source image

    Edit: this same method will also work for warpPerspective, although the resulting matrix multiplication gives a three-row (homogeneous) vector, and you'll need to divide the first two rows by the third row to get back to Cartesian coordinates, as sketched below. Other than that, everything else stays the same.
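
    For illustration, a rough sketch of that perspective case (not from the original answer) could look like the following, where dst_pts4 and src_pts4 are hypothetical np.float32 arrays of four corresponding point pairs, since cv2.getPerspectiveTransform needs four:

    # Hypothetical sketch of the perspective case. dst_pts4/src_pts4 stand in
    # for four corresponding point pairs (np.float32, shape (4, 2)).
    persp = cv2.getPerspectiveTransform(dst_pts4, src_pts4)      # 3x3 homography
    warped = persp.dot(dst_lin_homg_pts)                         # shape (3, N), homogeneous
    src_lin_pts = np.round(warped[:2] / warped[2]).astype(int)   # divide by the third row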