Sprite Sheet Detect Individual Sprite Bounds Automatically
Here's an approach
- Convert image to grayscale
- Otsu's threshold to obtain binary image
- Perform morphological transformations to smooth image
- Find contours
- Iterate through contours to draw bounding rectangle and extract ROI
After converting to grayscale, we apply Otsu's threshold to obtain a binary image
Next we perform morphological transformations to merge each sprite into a single contour
From here we find contours, iterate through each contour, draw the bounding rectangle, and extract each ROI. Here's the result
and here's each saved sprite ROI
I've implemented this method using OpenCV and Python, but you can adapt the strategy to any language:
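import cv2

# Load the sprite sheet and keep an untouched copy for cropping
image = cv2.imread('1.jpg')
original = image.copy()

# Grayscale, then inverted Otsu threshold (sprites darker than the background become white)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Close small gaps, then dilate so each sprite merges into a single contour
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel, iterations=2)
dilate = cv2.dilate(close, kernel, iterations=1)

# findContours returns 2 values in OpenCV 4.x and 3 values in 3.x
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

# Save each sprite and draw its bounding rectangle
sprite_number = 0
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    # Crop from the clean copy so drawn rectangles never end up in a saved sprite
    ROI = original[y:y+h, x:x+w]
    cv2.imwrite('sprite_{}.png'.format(sprite_number), ROI)
    cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
    sprite_number += 1

cv2.imshow('thresh', thresh)
cv2.imshow('dilate', dilate)
cv2.imshow('image', image)
cv2.waitKey()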
How about this? The only downside is that you'll need a writable copy of your image (or a separate mask of visited pixels) to mark visited pixels in; otherwise the flood fill will never terminate.
Process each* scan line in turn
    For each scanline, walk from left to right, until you find a non-transparent pixel P.
        If the location of P is already inside a known bounding box
            Continue to the right of the bounding box
        Else
            BBox = ExploreBoundedBox(P)
            Add BBox to the collection of known bounding boxes

Function ExploreBoundedBox(pStart)
    Mark pStart
    Q = new Queue(pStart)
    B = new BoundingBox(pStart)
    While Q is not empty
        Dequeue the front element as P
        Expand B to include P
        For each of the four neighbouring pixels N
            If N is not transparent and N is not marked
                Mark N
                Enqueue N at the back of Q
    return B
* You don't need to process every scanline; you could process every 10th or every 30th scanline, as long as the step doesn't exceed the minimum sprite height.
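The pseudocode above could be sketched in plain Python roughly as follows. This is only an illustration under a few assumptions: the image is given as a 2D list of alpha values where 0 means transparent, a separate `visited` grid stands in for marking pixels in the image itself, and the names `find_sprite_boxes`, `explore_bounding_box`, and `step` are made up for this example.

```python
from collections import deque

def explore_bounding_box(alpha, visited, start):
    """Breadth-first flood fill from start; returns (min_x, min_y, max_x, max_y)."""
    height, width = len(alpha), len(alpha[0])
    sx, sy = start
    visited[sy][sx] = True          # mark the start so it is never re-enqueued
    queue = deque([start])
    min_x = max_x = sx
    min_y = max_y = sy
    while queue:
        x, y = queue.popleft()
        # Expand the box to include the current pixel
        min_x, max_x = min(min_x, x), max(max_x, x)
        min_y, max_y = min(min_y, y), max(max_y, y)
        # Visit the four direct neighbours
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nx < width and 0 <= ny < height
                    and alpha[ny][nx] > 0 and not visited[ny][nx]):
                visited[ny][nx] = True
                queue.append((nx, ny))
    return (min_x, min_y, max_x, max_y)

def find_sprite_boxes(alpha, step=1):
    """Scan every `step`-th row and collect one bounding box per sprite found."""
    height, width = len(alpha), len(alpha[0])
    visited = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(0, height, step):
        x = 0
        while x < width:
            if alpha[y][x] > 0:
                # Is this pixel already inside a known bounding box?
                box = next((b for b in boxes
                            if b[0] <= x <= b[2] and b[1] <= y <= b[3]), None)
                if box is None:
                    box = explore_bounding_box(alpha, visited, (x, y))
                    boxes.append(box)
                x = box[2] + 1      # continue to the right of the box
            else:
                x += 1
    return boxes

# Tiny example sheet: a 2x2 square and a plus-shaped sprite
sheet = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 9, 0, 0],
    [0, 0, 0, 0, 9, 9, 9, 0],
    [0, 0, 0, 0, 0, 9, 0, 0],
]
boxes = find_sprite_boxes(sheet)
print(boxes)  # [(1, 1, 2, 2), (4, 2, 6, 4)]
```

Like the pseudocode, this skips any pixel that falls inside an already known box, so two sprites whose bounding boxes overlap may not both be detected. Passing `step=2` on the example sheet gives the same two boxes, since both sprites are at least two rows tall.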