
How to detect that PART of an image was shifted?


I have several photos of documents and I need to determine whether the documents have been altered. The task consists of two parts: 1. Text shifting. Detect that a special text inscription was shifted:


Pay attention to the inscription: TEST TEXT HERE has a gap (at the letter S).

  2. Background shifting. Detect a similar shift of the background ornament image.

MY QUESTION: are there methods to detect such shifts? They can be horizontal or vertical. I always know the test inscription (for the first task) and I always have a good-quality sample of the ornament (for the second). And I have a set of photos that I need to check.

My workaround. 1. First idea: superimpose the correct inscription on each photo and check the pixel colours around it: if there are a lot of red pixels, there is a shift. This works for the text but not for the image. 2. For the image I thought about Fourier transforms: if there is a gap in the image, there is a jump in the function at the same coordinate. But I don't know of any implementations of this method.


First of all, I know that this type of question is too broad for SO, but I already found a similar one (How to detect a shift between images), so I hope it will not be closed! Second remark: I'm open to any algorithms, both classical and machine learning.


Solution

  • I have made a script showing the way I would approach the problem. It may not be the best approach, but I hope it helps a bit or gives you a new point of view on how to proceed.

    First I would transform the image to the HSV colorspace, because transforming straight to binary would not work well here: there is a lot of noise surrounding the text. After converting you can extract the text with cv2.inRange and a threshold. Then I search for contours and draw them on a new blank mask.


    The next step is to perform an opening to merge the nearby contours into one big contour. The opening is followed by a dilation to remove the remaining character T in the upper left corner.


    Next I would search for contours again and draw a bounding rectangle. If the contour fills its bounding box exactly, the drawn rectangle is invisible; but if the contour is shifted, the bounding box encloses two smaller rectangles (of the opposite colour), like this:


    Finally, search for contours again, this time with a size threshold, and draw the small leftover contours on the image.

    RESULT:


    CODE:

    # Import modules
    import cv2
    import numpy as np
    
    # Read the image and transform to HSV colorspace.
    img = cv2.imread('ID.png')
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    
    # Extract the red text.
    lower_red = np.array([150,150,50])
    upper_red = np.array([200,255,255])
    mask_red = cv2.inRange(hsv, lower_red, upper_red)
    
    # Search for contours on the mask ([-2:] keeps this compatible with both OpenCV 3.x and 4.x).
    contours, hierarchy = cv2.findContours(mask_red, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)[-2:]
    
    # Mask for processing.
    mask = np.ones(img.shape, np.uint8)*255
    
    # Iterate through contours and draw them on mask.
    for cnt in contours:
        cv2.drawContours(mask, [cnt], -1, (0,0,0), -1)
    
    # Perform opening to unify contours.
    kernel = np.ones((15,15),np.uint8)
    opening = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    
    # Perform dilation to remove some noises.
    kernel_d = np.ones((2,2),np.uint8)
    dilation = cv2.dilate(opening,kernel_d,iterations = 1)
    
    # Searching for contours on the new mask.
    gray_op = cv2.cvtColor(dilation, cv2.COLOR_BGR2GRAY)
    _, threshold_op = cv2.threshold(gray_op, 150, 255, cv2.THRESH_BINARY_INV)
    contours_op, hierarchy_op = cv2.findContours(threshold_op, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)[-2:]
    
    # Iterate through contours and draw a bounding rectangle.
    for cnt in contours_op:
        x,y,w,h = cv2.boundingRect(cnt)
        cv2.rectangle(threshold_op,(x,y),(x+w,y+h),(255,255,255),1)
    
    # Searching for contours again on the new mask.
    contours_f, hierarchy_f = cv2.findContours(threshold_op, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)[-2:]
    
    # Iterate through contours and add size for thresholding out the rest.
    for cnt in contours_f:
        size = cv2.contourArea(cnt)
        if size < 1000:
            x,y,w,h = cv2.boundingRect(cnt)
            cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),-1)
    
    # Display the result.
    cv2.imshow('img', img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    

    For the second image this approach will not work, because that picture is far more complex.

    On the second image I would try to dilate it so that the only things left are the bottom 3 lines (or, in the shifted case, 4) and count the number of contours. If 4 contours are present, then it is shifted.


    Or a second approach for the second image: split the pattern into 3 separate contours with cv2.rectangle() and calculate the minimum distance from each of them to a line you draw near the bottom of the image. This way you can detect the shift even if the break happened above the bottom lines.

    CODE for second image:

    # Import modules
    import cv2
    import numpy as np
    import scipy
    from scipy import spatial
    
    # Read image
    img_original = cv2.imread('ID_sec4.png')
    img = img_original.copy()
    
    # Get height and width of the image
    h1, w1, ch = img.shape
    
    # Draw line somewhere in the bottom of the image
    cv2.line(img, (10, h1-10), (w1-10, h1-10), (0,0,0), 3)
    
    # Search for contours and select the biggest one (the pattern)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    _, threshold = cv2.threshold(gray, 240, 255, cv2.THRESH_BINARY_INV)
    contours, hierarchy = cv2.findContours(threshold, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)[-2:]
    cnt_big = max(contours, key=cv2.contourArea)
    
    # Draw white rectangles to separate the extreme left and extreme right side of the contour
    x, y, w, h = cv2.boundingRect(cnt_big)
    cv2.rectangle(img,(0,0),(x+20,y+h+20),(255,255,255),2)
    cv2.rectangle(img,(w1,0),(w, y+h+20),(255,255,255),2)
    
    # Search for contours again
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    _, thresh = cv2.threshold(gray, 240, 255, cv2.THRESH_BINARY_INV)
    cnts, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)[-2:]
    
    # The reference is the lowest contour, i.e. the line drawn above
    # (building the KD-tree once, outside the loop).
    ref = max(cnts, key=lambda c: cv2.boundingRect(c)[1])
    reshape2 = np.reshape(ref, (-1,2))
    tree = spatial.cKDTree(reshape2)
    
    # Iterate over the contours, calculate the minimum distance from each
    # one to the line, and draw a bounding box if it fits the criteria
    for i in cnts:
        reshape1 = np.reshape(i, (-1,2))
        mindist, minid = tree.query(reshape1)
        distances = np.reshape(mindist, (-1,1))
        under_min = [m for m in distances if 1 < m < 70]
        if len(under_min) > 1:
            x, y, w, h = cv2.boundingRect(i)
            cv2.rectangle(img_original,(x-10,y-10),(x+w+10,y+h+10),(0,255,0),2)
    
    # Display the result
    cv2.imshow('img', img_original)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    

    Result:


    Hope it helps a bit. Cheers!