I have a binary which has some text on different parts of the image like at the bottom, top, center, right middle center, etc.
Original Image
The areas I would like to focus on are the manually drawn regions shown in red.
I calculated the horizontal and vertical summations of the image and plotted them:
plot(sum(edgedImage1,1))
plot(sum(edgedImage1,2))
Can somebody give me explanation of what these plots are telling me about the original image with regards to the structure of which I explained above? Moreover, how could these plots help me extracting those regions I just manually drew in red?
There's nothing sophisticated about the sum operation. Simply put, sum(edgedImage1,1)
computes the sum of all rows for each column in the image and that is what you are plotting. Effectively, you are computing the sum of all non-zero values (i.e. white pixels) over all rows for each column. The horizontal axis in the plot denotes what row's sum you are observing. Similarly, sum(edgedImage,2)
computes the sum of all columns for each row of the image and that is what you are plotting.
Because your text is displayed in a horizontal fashion, sum(edgeImage,1)
won't be particularly useful. What is very useful is the sum(edgedImage,2)
operation. For lines in your image that are blank, the horizontal sum of columns for each row of your image should be a very small value whereas for lines in your image that contain text or strokes, the sum should be quite large. A good example of what I'm talking about is seen towards the bottom of your image. If you consult between rows 600 and 700, you see a huge spike in your plot as there is a lot of text that is surrounded between those rows.
Using this result, a crude way to determine what areas in your image that contain text or strokes in your case would be to find all rows that surpass some threshold. Combined with finding modes or peaks from the sum operation that was just performed, you can very easily localize and separate out each region of text.
You may want to smooth the curve provided by sum(edgedImage,2)
if you decide to determine how many blobs of text there are. Once you smooth out this signal, you will clearly see that there are 5 modes corresponding to 5 lines of text.