I'm developing a handwriting recognition project. One requirement is to take an input image that contains only some characters at random locations; as a first step, I must extract these characters so they can be processed in the next step.
Now I'm stuck on a hard problem: how do I extract a single character from a black-and-white (binary) image, or how do I draw a bounding rectangle around a character in such an image?
Thanks very much!
If you are using MATLAB (which I hope you are, since it is awesome for tasks like these), I suggest you look into the built-in functions bwlabel() and regionprops(). These should be enough to segment out all the characters and get their bounding box information.
Some sample code is given below:
%Read image
Im = imread('im1.jpg');
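%Convert to grayscale first in case the JPEG is stored as RGB (bwlabel needs a 2-D image)
if size(Im,3) == 3, Im = rgb2gray(Im); end
%Make binary: dark pixels (ink) become 1, light pixels (background) become 0
Im = Im < 128;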
%Segment out all connected regions
ImL = bwlabel(Im);
%Get labels for all distinct regions
labels = unique(ImL);
%Remove label 0, corresponding to background
labels(labels==0) = [];
%Get bounding box for each segmentation
Character = struct('BoundingBox',zeros(1,4));
nrValidDetections = 0;
for i=1:length(labels)
D = regionprops(ImL==labels(i));
if D.Area > 10
nrValidDetections = nrValidDetections + 1;
Character(nrValidDetections).BoundingBox = D.BoundingBox;
end
end
%Visualize results
figure(1);
imagesc(ImL);
xlim([0 200]);
for i=1:nrValidDetections
rectangle('Position',[Character(i).BoundingBox(1) ...
Character(i).BoundingBox(2) ...
Character(i).BoundingBox(3) ...
Character(i).BoundingBox(4)]);
end
The image I read in here has values from 0 to 255, so I have to threshold it to make it binary. Since the dots above i and j form their own small connected regions and can be a problem, I also threshold on the number of pixels that make up each distinct region (the D.Area > 10 check).
The result can be seen here: https://www.sugarsync.com/pf/D775999_6750989_128710
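If you also need each character as its own small image (the "extract" part of the question), here is a minimal sketch of one way to do it. It is not the code that produced the result above. It assumes the Image Processing Toolbox, uses a tiny synthetic binary image (BW) in place of your thresholded input so that it runs on its own, and the variable names stats and chars are just illustrative. It relies on the 'Image' property of regionprops, which returns the pixels of each region cropped to its bounding box.

```matlab
% Synthetic binary image standing in for the thresholded input (assumption)
BW = false(40, 60);
BW(5:20, 5:12)   = true;   % first "character"
BW(10:30, 30:45) = true;   % second "character"
BW(2, 55)        = true;   % single-pixel speck that should be rejected

% regionprops labels a logical image itself, so bwlabel is not needed here
stats = regionprops(BW, 'Area', 'BoundingBox', 'Image');

% Keep only regions larger than 10 pixels (same threshold as the code above)
stats = stats([stats.Area] > 10);

% Collect one cropped binary image per remaining region
chars = cell(1, numel(stats));
for k = 1:numel(stats)
    chars{k} = stats(k).Image;
end
```

Each chars{k} can then be passed to the recognition step, and stats(k).BoundingBox still gives the rectangle to draw. Note that with this area filter the dot of an i or j is dropped rather than attached to its stem, so you may need to handle those letters separately.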