image-processing, ocr, pattern-recognition

character matching in grayscale image


I made patterns: images of the letter "A" at different sizes (12 to 72 in steps of 2: 12, 14, ..., 72). I tested pattern matching with them and it gave good results. One way to select text regions in an image is to run that algorithm for every uppercase and lowercase letter and every digit, at every size. And for every font! I don't like that. Instead, I want something like a universal pattern, or, better said: scan the image with windows of different sizes and select the regions where some function (the probability that the window contains a character) exceeds a fixed value. Do you know any methods or ideas for building such a function? It must work on the original (grayscale) image.


Solution

  • I suppose you are developing OCR, right?

    You have chosen quite an unusual route, since everyone else does matching on bi-tonal images. That makes everything much simpler. Once you have binarized the image properly (which is a very difficult task in itself), you do not have to deal with different brightness levels, uneven backgrounds, and so on. And, of course, it needs fewer computing resources. However, if doing everything in grayscale is actually your goal and you want to show other OCR researchers that it is doable, then I wish you good luck.

    The approach to locating letters that you described is very computation-intensive. You have to scan the whole image (image_size^2), match the pattern at each position (* pattern_size^2), and repeat that for every pattern (* pattern_num). This will be incredibly slow.
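
    To put a rough number on that, here is a small sketch that counts pixel comparisons for exhaustive matching. The 1000 x 1000 image, the square patterns, the single font and the helper name `brute_force_cost` are all assumptions made for illustration; only the size range 12..72 comes from the question.

```python
def brute_force_cost(image_side, pattern_sides, num_glyphs):
    """Count pixel comparisons when every pattern is tried at every window position."""
    total = 0
    for p in pattern_sides:
        positions = (image_side - p + 1) ** 2  # placements of a p x p window
        total += positions * p * p * num_glyphs  # p*p comparisons per placement
    return total


sizes = range(12, 73, 2)   # 12, 14, ..., 72, as in the question
glyphs = 26 + 26 + 10      # upper case, lower case, digits (one font, assumed)
cost = brute_force_cost(1000, sizes, glyphs)
print(f"{cost:.2e} pixel comparisons")
```

    Every additional font multiplies this figure again, which is why cutting down the number of window positions matters more than speeding up a single comparison.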

    Instead, try splitting your algorithm into two stages. The first stage should look for coarse features in the picture (for example, find connected dark regions, or split the image into large squares and throw away all the light ones); only then perform pattern matching on the small number of areas found. This first stage is still at least N^2, so you could try to reduce the complexity further by working on rows or columns of the image first (by building a histogram of each). There are many simplifications of this kind that you can experiment with.
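
    One possible sketch of the row/column histogram idea, using NumPy: rows that contain dark pixels are grouped into bands, and inside each band the columns that contain dark pixels are grouped into candidate boxes. The function names, the `dark_threshold` of 128 and the synthetic test page are assumptions for illustration; the threshold is used only to locate candidates, and the later matching can still run on the grayscale pixels.

```python
import numpy as np


def runs(mask):
    """Return (start, stop) pairs, stop exclusive, for each run of True in a 1-D mask."""
    padded = np.concatenate(([False], mask, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(a), int(b)) for a, b in zip(changes[::2], changes[1::2])]


def candidate_boxes(gray, dark_threshold=128):
    """Locate (top, bottom, left, right) boxes that contain dark pixels."""
    dark = gray < dark_threshold
    boxes = []
    # Row histogram: consecutive rows with dark pixels form a band.
    for top, bottom in runs(dark.sum(axis=1) > 0):
        band = dark[top:bottom]
        # Column histogram inside the band: consecutive dark columns form a box.
        for left, right in runs(band.sum(axis=0) > 0):
            rows = np.flatnonzero(band[:, left:right].any(axis=1))
            # Tighten the box vertically to the rows that are actually dark.
            boxes.append((top + int(rows[0]), top + int(rows[-1]) + 1, left, right))
    return boxes


# Synthetic page: light background with three dark blobs standing in for glyphs.
page = np.full((20, 30), 255, dtype=np.uint8)
page[2:6, 3:7] = 30
page[3:5, 10:14] = 60
page[12:18, 5:9] = 10
boxes = candidate_boxes(page)
print(boxes)
```

    Only these few boxes need to be passed on to pattern matching, instead of every window position in the image. Touching or overlapping glyphs would end up in one box, so this is a starting point rather than a complete segmenter.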

    Once you have located those objects in the picture and are about to match patterns against them, you already know their size. So you don't have to store the letter A in every size; you can just rescale the located region to a fixed size, say 72, and match that.
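
    A minimal sketch of that rescale-then-match step, working directly on grayscale values. Nearest-neighbour resizing and normalized cross-correlation are choices made here for brevity, not something the approach requires, and the function names and the synthetic "glyphs" (plain bars) are hypothetical.

```python
import numpy as np


def resize_nearest(gray, size):
    """Nearest-neighbour resize of a 2-D array to size x size."""
    h, w = gray.shape
    rows = (np.arange(size) * h) // size
    cols = (np.arange(size) * w) // size
    return gray[np.ix_(rows, cols)]


def ncc(a, b):
    """Normalized cross-correlation of two equal-shape arrays, in [-1, 1]."""
    a = a.astype(float) - a.mean()
    b = b.astype(float) - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def match_score(region, template):
    """Rescale a located region to the template size, then compare the two."""
    return ncc(resize_nearest(region, template.shape[0]), template)


# Synthetic stand-ins for glyphs: a vertical bar and a horizontal bar, 24 x 24.
vertical = np.full((24, 24), 200)
vertical[4:20, 10:14] = 40
horizontal = np.full((24, 24), 200)
horizontal[10:14, 4:20] = 40

template = np.kron(vertical, np.ones((3, 3)))  # the single stored pattern, 72 x 72
dim_region = vertical * 0.5 + 30               # same shape, lower contrast

same_score = match_score(dim_region, template)
other_score = match_score(horizontal, template)
print(same_score, other_score)
```

    Because the correlation subtracts the mean and divides by the energy of each patch, a uniform change in brightness or contrast does not lower the score, which helps when matching on the original grayscale image.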

    As for fonts, you don't have much choice here: you will need to match against all possible shapes of A to be sure you have found an A. But since you now match against just one size of A, you have more computing power left to try different A's.