Tags: algorithm, image, image-processing, computer-vision, sprite

How to automatically detect and crop individual sprite bounds in sprite sheet?


Given a sprite sheet like this:

[Image: Sprite Sheet Example]

I would like to write an algorithm that loops through the pixel data and determines the bounding rectangle of each discrete sprite.

Assume that for each pixel (x, y) I can get either true (the pixel is not fully transparent) or false (the pixel is fully transparent). How would I go about automatically generating the bounding rectangle for each sprite?

The resulting data should be an array of rectangle objects with {x, y, width, height}.

Here's the same image but with the bounds of the first four sprites marked in light blue:

[Image: Sprite Sheet With Bounds]

Can anyone give a step-by-step explanation of how to detect these bounds as described above?


Solution

  • Here's an approach:

    • Convert the image to grayscale
    • Apply Otsu's threshold to obtain a binary image
    • Perform morphological transformations to smooth the image
    • Find contours
    • Iterate through the contours to draw each bounding rectangle and extract each ROI

    After converting to grayscale, we apply Otsu's threshold to obtain a binary image.

    Next we perform morphological transformations to merge each sprite into a single contour.

    From here we find contours, iterate through each contour, draw the bounding rectangle, and extract each ROI. The result is the sheet with a rectangle drawn around each detected sprite, plus each sprite ROI saved as its own image.

    I've implemented this method using OpenCV and Python, but you can adapt the strategy to any language:

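    import cv2
    
    # Load the sprite sheet and keep an unmarked copy to crop sprites from
    image = cv2.imread('1.jpg')
    original = image.copy()
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Inverted Otsu threshold: pixels darker than the automatic threshold become white (foreground)
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    
    # Close small gaps, then dilate so the parts of each sprite merge into one blob
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
    close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel, iterations=2)
    dilate = cv2.dilate(close, kernel, iterations=1)
    
    # findContours returns 2 values in OpenCV 2.x/4.x and 3 values in 3.x
    cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]
    
    for sprite_number, c in enumerate(cnts):
        x, y, w, h = cv2.boundingRect(c)
        # Crop from the unmarked copy so rectangles drawn earlier never leak into a sprite
        ROI = original[y:y+h, x:x+w]
        cv2.imwrite('sprite_{}.png'.format(sprite_number), ROI)
        cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
    
    # Show the intermediate masks and the annotated sheet
    cv2.imshow('thresh', thresh)
    cv2.imshow('dilate', dilate)
    cv2.imshow('image', image)
    cv2.waitKey()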
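
If OpenCV is not available, the same idea can be approximated with the true/false pixel test from the question. The following is a minimal sketch of a different technique from the contour code above: connected-component labeling with a breadth-first flood fill. The names `find_sprite_bounds` and `is_opaque` are hypothetical, and the sketch assumes that the pixels of one sprite touch each other (8-connectivity) and that separate sprites do not touch.

```python
from collections import deque


def find_sprite_bounds(is_opaque, width, height):
    """Return one {x, y, width, height} dict per 8-connected group of opaque pixels.

    is_opaque(x, y) is a hypothetical callback standing in for the question's
    "pixel is not totally transparent" test.
    """
    visited = [[False] * width for _ in range(height)]
    rects = []
    for start_y in range(height):
        for start_x in range(width):
            if visited[start_y][start_x] or not is_opaque(start_x, start_y):
                continue
            # New sprite found: flood-fill it while tracking its extremes.
            visited[start_y][start_x] = True
            queue = deque([(start_x, start_y)])
            min_x = max_x = start_x
            min_y = max_y = start_y
            while queue:
                x, y = queue.popleft()
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                # Visit the 8 neighbours (the centre pixel is already visited).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and not visited[ny][nx] and is_opaque(nx, ny)):
                            visited[ny][nx] = True
                            queue.append((nx, ny))
            rects.append({
                "x": min_x,
                "y": min_y,
                "width": max_x - min_x + 1,
                "height": max_y - min_y + 1,
            })
    return rects


# Tiny hand-made "sheet": '#' is an opaque pixel, '.' is fully transparent.
sheet = [
    "##....#",
    "##...##",
    ".......",
    "..#....",
]
bounds = find_sprite_bounds(lambda x, y: sheet[y][x] == "#", len(sheet[0]), len(sheet))
for rect in bounds:
    print(rect)
```

Rectangles come out in the order their first pixel is met when scanning row by row. A sprite drawn as several disconnected pieces would be reported as several rectangles; the closing and dilation steps in the OpenCV version exist to bridge exactly those gaps, so a comparable merge pass would be needed here for such sheets.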