Tags: python, opencv, image-processing, scikit-image

How to locate and extract a maze from a photo without being sensitive to warping or lighting


I have asked several questions on SO about locating and extracting a maze from a photo, but none of the answers I got works across different photos, not even across my 4 test photos. Every time I tweaked the code to make it work for one photo, it failed on the rest due to warped corners/parts, lighting, etc. I feel that I need to find an approach that is insensitive to warping, to different light intensities, and to the different colors of the maze walls (the lines inside a maze).

I have been trying to make this work for 3 weeks without luck. Before I drop the idea, I would like to ask: is it possible to locate and extract a maze from a photo using image processing alone, without AI? If yes, could you please show me how to do it?

Here are the code and photos:

import cv2
import matplotlib.pyplot as plt
import numpy as np

from skimage.feature import corner_harris, corner_peaks
from skimage.morphology import reconstruction, binary_erosion
from skimage.morphology.convex_hull import convex_hull_image
from skimage.util import invert

# Binarize (the fixed threshold of 100 is ignored when Otsu's method is used)
maze = cv2.imread("simple.jpg", 0)
ret, maze = cv2.threshold(maze, 100, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Reconstruction by dilation from a small seed patch in the image center
h, w = maze.shape
seed = np.zeros_like(maze)
size = 40
hh = h // 2
hw = w // 2
seed[hh-size:hh+size, hw-size:hw+size] = maze[hh-size:hh+size, hw-size:hw+size]
rec1 = reconstruction(seed, maze)

# Slightly thicken the walls
ker = np.ones((2, 2), np.uint8)
rec1_thicker = cv2.erode(rec1, ker, iterations=1)

# Reconstruction by erosion from a larger central patch
seed2 = np.ones_like(rec1) * 255
size2 = 240
lhh = hh - size2
hhh = hh + size2
lhw = hw - size2
hhw = hw + size2
seed2[lhh:hhh, lhw:hhw] = rec1_thicker[lhh:hhh, lhw:hhw]
rec2 = reconstruction(seed2, rec1_thicker, method='erosion')

# Convex hull of the extracted maze and Harris corners of the eroded hull
rec2_inv = invert(rec2 / 255.)
hull = convex_hull_image(rec2_inv)
hull_eroded = binary_erosion(hull, selem=np.ones((5, 5)))
coords = corner_peaks(corner_harris(hull_eroded), min_distance=5, threshold_rel=0.02)

fig, axe = plt.subplots(1, 4, figsize=(16, 8))
axe[0].imshow(maze, 'gray')
axe[1].imshow(rec1, 'gray')
axe[2].imshow(rec2, 'gray')
axe[3].imshow(hull, 'gray')
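
For context: reconstruction by dilation keeps only the bright structures that are connected to the seed, which is why the code seeds from a patch in the image center. A minimal toy example (with made-up values):

import numpy as np
from skimage.morphology import reconstruction

# The mask has two components; the seed touches only the left one
mask = np.array([[0, 255, 0,   0, 0],
                 [0, 255, 0, 255, 0],
                 [0, 255, 0, 255, 0]], dtype=float)
seed = np.zeros_like(mask)
seed[1, 1] = 255  # single seed pixel inside the left component

# The left component is fully recovered; the right one, not being
# connected to the seed, is suppressed to 0.
print(reconstruction(seed, mask))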

Here is the output image: [four plots: thresholded maze, rec1, rec2, convex hull]

As you can see, the 3rd plot is the extracted maze. This piece of code works well, but only for these 2 photos, simple.jpg and maze.jpg...

If you try hard.jpg, the result looks like this: [output for hard.jpg]

It also fails on middle.jpg: [output for middle.jpg]

I have uploaded all 4 test photos to OneDrive for anyone who is interested in trying them out.


Update 1

I plotted all masks to see what each one of them does.

mask = (sat < 16).astype(np.uint8) * 255
mask1 = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, cv2.getStructuringElement(cv2.MORPH_RECT, (31, 31)))
mask2 = cv2.copyMakeBorder(mask1, 1, 1, 1, 1, cv2.BORDER_CONSTANT, 0)
mask3 = cv2.morphologyEx(mask2, cv2.MORPH_OPEN, cv2.getStructuringElement(cv2.MORPH_RECT, (201, 201)))

plt.figure(figsize=(18, 8))
plt.subplot(1, 6, 1), plt.imshow(maze[..., ::-1]), plt.title('White balanced image')
plt.subplot(1, 6, 2), plt.imshow(sat, 'gray'), plt.title('Saturation channel')
plt.subplot(1, 6, 3), plt.imshow(mask, 'gray'), plt.title('sat < 16')
plt.subplot(1, 6, 4), plt.imshow(mask1, 'gray'), plt.title('closed')
plt.subplot(1, 6, 5), plt.imshow(mask2, 'gray'), plt.title('border')
plt.subplot(1, 6, 6), plt.imshow(mask3, 'gray'), plt.title('rect')
plt.tight_layout(), plt.show()

[plots of the six mask stages listed above]

So it seems to me that mask2, which adds a border around the entire image, is not necessary. Why do we need mask2?

I also found that mask2 and mask3 are 2 pixels bigger in each dimension:

maze.shape, sat.shape, mask.shape, mask1.shape, mask2.shape, mask3.shape
((4000, 1840, 3),
 (4000, 1840),
 (4000, 1840),
 (4000, 1840),
 (4002, 1842),
 (4002, 1842))

Why?
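
For the shape question, a quick standalone check with a dummy array shows where the extra 2 pixels per dimension come from: cv2.copyMakeBorder pads the image by the given number of pixels on each side.

import cv2
import numpy as np

a = np.zeros((4000, 1840), np.uint8)
# 1 pixel of constant border on each of the four sides
b = cv2.copyMakeBorder(a, 1, 1, 1, 1, cv2.BORDER_CONSTANT, value=0)
print(a.shape, b.shape)  # (4000, 1840) (4002, 1842)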


Solution



    For the four given images, I could get quite good results using the following workflow:

    • White balance the input image to enforce nearly white paper. I took this approach (cf. https://stackoverflow.com/a/54481969/11089932) using a small patch from the center of the image, and from that patch, I took the pixel with the highest R + G + B value – assuming the maze is always centered in the image, and that there are some pixels from the white paper within the small patch. (A short numeric sketch of this scaling follows after the list.)
    • Use the saturation channel from the HSV color space to mask the white paper, and (roughly) crop that portion from the image.
    • On that crop, perform the existing reconstruction approach.
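
    To make the white-balance scaling concrete, here is a small numeric sketch with made-up pixel values: every channel of the image is multiplied by lum / white_channel, which maps the chosen reference pixel to a neutral gray.

    # Hypothetical "whitest" pixel (B, G, R) found in the center patch
    white_b, white_g, white_r = 200.0, 220.0, 240.0
    lum = (white_r + white_g + white_b) / 3  # 220.0

    # Per-channel scale factors applied to the whole image
    print(lum / white_b, lum / white_g, lum / white_r)  # 1.1 1.0 0.9166...

    # The reference pixel itself becomes a neutral gray (220, 220, 220)
    print(white_b * lum / white_b, white_g * lum / white_g, white_r * lum / white_r)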

    Here are the results:

    maze.jpg: [result image]
    simple.jpg: [result image]
    middle.jpg: [result image]
    hard.jpg: [result image]

    That's the full code:

    import cv2
    import matplotlib.pyplot as plt
    import numpy as np
    from skimage.morphology import binary_erosion, reconstruction
    from skimage.morphology.convex_hull import convex_hull_image
    
    
    # https://stackoverflow.com/a/54481969/11089932
    def simple_white_balancing(image):
        h, w = image.shape[:2]
        patch = image[int(h/2-20):int(h/2+20), int(w/2-20):int(w/2+20)]
        x, y = cv2.minMaxLoc(np.sum(patch.astype(int), axis=2))[3]
        white_b, white_g, white_r = patch[y, x, ...].astype(float)
        lum = (white_r + white_g + white_b) / 3
        image[..., 0] = image[..., 0] * lum / white_b
        image[..., 1] = image[..., 1] * lum / white_g
        image[..., 2] = image[..., 2] * lum / white_r
        return image
    
    
    for file in ['maze.jpg', 'simple.jpg', 'middle.jpg', 'hard.jpg']:
    
        # Read image
        img = cv2.imread(file)
    
        # Initialize hull image
        h, w = img.shape[:2]
        hull = np.zeros((h, w), np.uint8)
    
        # Simple white balancing, cf. https://stackoverflow.com/a/54481969/11089932
        img = cv2.GaussianBlur(img, (11, 11), None)
        maze = simple_white_balancing(img.copy())
    
        # Mask low saturation area
        sat = cv2.cvtColor(maze, cv2.COLOR_BGR2HSV)[..., 1]
        mask = (sat < 16).astype(np.uint8) * 255
        mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE,
                                cv2.getStructuringElement(cv2.MORPH_RECT,
                                                          (31, 31)))
        mask = cv2.copyMakeBorder(mask, 1, 1, 1, 1, cv2.BORDER_CONSTANT, 0)
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN,
                                cv2.getStructuringElement(cv2.MORPH_RECT,
                                                          (201, 201)))
    
        # Find largest contour in mask (w.r.t. the OpenCV version)
        cnts = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
        cnts = cnts[0] if len(cnts) == 2 else cnts[1]
        cnt = max(cnts, key=cv2.contourArea)
        x, y, w, h = cv2.boundingRect(cnt)
    
        # Crop to low saturation area; the +1 offsets compensate for the
        # 1 pixel border that was added around the mask
        cut = cv2.cvtColor(maze[y+1:y+1+h, x+1:x+1+w], cv2.COLOR_BGR2GRAY)
    
        # Use existing reconstruction approach on low saturation area
        h_c, w_c = cut.shape
        seed = np.zeros_like(cut)
        size = 40
        hh = h_c // 2
        hw = w_c // 2
        seed[hh-size:hh+size, hw-size:hw+size] = cut[hh-size:hh+size, hw-size:hw+size]
        rec = reconstruction(seed, cut)
        rec = cv2.erode(rec, np.ones((2, 2)), iterations=1)
    
        seed = np.ones_like(rec) * 255
        size = 240
        seed[hh-size:hh+size, hw-size:hw+size] = rec[hh-size:hh+size, hw-size:hw+size]
        rec = reconstruction(seed, rec, method='erosion').astype(np.uint8)
        rec = cv2.threshold(rec, np.quantile(rec, 0.25), 255, cv2.THRESH_BINARY_INV)[1]
    
        hull[y+1:y+1+h, x+1:x+1+w] = convex_hull_image(rec) * 255
    
        plt.figure(figsize=(18, 8))
        plt.subplot(1, 5, 1), plt.imshow(img[..., ::-1]), plt.title('Original image')
        plt.subplot(1, 5, 2), plt.imshow(maze[..., ::-1]), plt.title('White balanced image')
        plt.subplot(1, 5, 3), plt.imshow(sat, 'gray'), plt.title('Saturation channel')
        plt.subplot(1, 5, 4), plt.imshow(hull, 'gray'), plt.title('Obtained convex hull')
        plt.subplot(1, 5, 5), plt.imshow(cv2.bitwise_and(img, img, mask=hull)[..., ::-1])
        plt.tight_layout(), plt.savefig(file + 'output.png'), plt.show()
    

    Of course, there's no guarantee that this approach will work for the next five or so images you work on. In general, try to standardize the image acquisition (rotation, lighting) to get more consistent images. Otherwise, you'll end up needing some machine learning approach...

    ----------------------------------------
    System information
    ----------------------------------------
    Platform:      Windows-10-10.0.16299-SP0
    Python:        3.9.1
    PyCharm:       2021.1.1
    Matplotlib:    3.4.1
    NumPy:         1.20.2
    OpenCV:        4.5.1
    scikit-image:  0.18.1
    ----------------------------------------