I'm working on a C# OCR program (project for my own learning purposes, nothing commercial-quality) that will recognize Hebrew characters. I plan to do this by separating the glyphs from the images and then applying template matching methods.
Where I'm at
I can now separate individual glyphs out of images. Each glyph is represented as a 2D array of pixels. For instance, the character "bet" looks something like:
..........
.*******..
.......*..
.......*..
.********.
..........
where "." represents an empty space and "*" represents a filled-in pixel.
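To make that representation concrete, here is a minimal C# sketch of it. It assumes the pixels live in a `bool[,]` indexed as `[row, column]` and that every row has the same width. The `GlyphText` class and its `Parse` and `Render` helpers are hypothetical names used only for illustration, not part of the actual program.

```csharp
using System;
using System.Text;

public static class GlyphText
{
    // Converts rows of '.' (empty) and '*' (filled) into a pixel grid.
    // Assumes at least one row and that all rows have the same length.
    public static bool[,] Parse(params string[] rows)
    {
        var pixels = new bool[rows.Length, rows[0].Length];
        for (int y = 0; y < pixels.GetLength(0); y++)
            for (int x = 0; x < pixels.GetLength(1); x++)
                pixels[y, x] = rows[y][x] == '*';
        return pixels;
    }

    // Converts a pixel grid back to text, one line per row.
    public static string Render(bool[,] pixels)
    {
        var sb = new StringBuilder();
        for (int y = 0; y < pixels.GetLength(0); y++)
        {
            if (y > 0) sb.Append('\n');
            for (int x = 0; x < pixels.GetLength(1); x++)
                sb.Append(pixels[y, x] ? '*' : '.');
        }
        return sb.ToString();
    }
}
```

Passing the six rows shown above to `GlyphText.Parse` would give a 6x10 grid, and `GlyphText.Render` turns it back into the same text.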
I'm now at the point where I want to apply a template matching algorithm to identify which glyph this 2D array of pixels represents (in this case, it should match the "bet" template).
The issue
I'm having trouble finding a simple explanation of a good template matching algorithm (most of what I find are theses or links to code libraries), and was wondering if someone knew of any I might study.
I'd like to emphasize that I want to do this by hand and not simply use a library. I am willing to study how a library solves the problem, however, if it's not split into fifteen bajillion different pieces. :)
I'd also be willing to hear if there are any better methods for doing what I'm trying to do.
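Generate a number for each template. Since a template is an array of pixels, you can associate each pixel position with its own weight (a power of two: 1, 2, 4, 8, 16, etc.), count an empty pixel as 0 and a filled pixel as 1, and add up the weights of the filled pixels.
Then calculate the same total for each glyph and compare it with the template totals; a glyph matches the template whose total is equal to its own.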
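The idea above can be sketched in C# roughly as follows. This is only an illustration under some assumptions: the glyph is a `bool[,]`, weights are assigned in row-major order, and `BigInteger` is used because a glyph with more than 64 pixels would overflow a `ulong`. The `GlyphSignature` class and its `Compute` method are hypothetical names. Note that because every pixel has a distinct power-of-two weight, two totals are equal only when the bitmaps are pixel-for-pixel identical (given the same dimensions), so this finds exact matches and does not tolerate noise.

```csharp
using System;
using System.Numerics;

public static class GlyphSignature
{
    // Adds a distinct power of two for every filled pixel.
    // Weights run 1, 2, 4, 8, ... in row-major order.
    public static BigInteger Compute(bool[,] pixels)
    {
        BigInteger total = BigInteger.Zero;
        BigInteger weight = BigInteger.One;

        // foreach over a rectangular array visits it row by row.
        foreach (bool filled in pixels)
        {
            if (filled) total += weight;
            weight *= 2;
        }
        return total;
    }

    // Two grids match when they have the same size and the same total.
    public static bool Matches(bool[,] glyph, bool[,] template)
    {
        return glyph.GetLength(0) == template.GetLength(0)
            && glyph.GetLength(1) == template.GetLength(1)
            && Compute(glyph) == Compute(template);
    }
}
```

Computing the total once per template and storing it, for example as the key of a `Dictionary<BigInteger, char>`, would likely make identifying a glyph a single lookup instead of a loop over all templates.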