
Detect if an Image Pixel's Column or Row is a Line

I'm trying to detect the borders of scanned documents, because removing them should improve my OCR extraction rate. Borders count as marginal noise, so I have to get rid of them. They usually have the highest pixel density in an image.

I have examined every column of pixels in an image, and the column with the highest density is probably a border, but only if it is actually a line. That is where my problem arises: I don't know how to detect whether a column of pixels is a line or not.

Any help would be very much appreciated. Thanks.


  • You could use the Hough line transform, but it will also return lines from the content you need to run OCR on.

    The simplest solution I can think of, based on your question, is this. Since it's a border, you can restrict the search space to strips along the image edges, using a threshold on width and height. For example, if your image is w x h and your strip width is s, the search space is columns 0 to s and w-s to w, and rows 0 to s and h-s to h.
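
A minimal sketch of this idea in pure Python: look only at the border strips, and treat a column or row as a line only when its longest unbroken run of dark pixels spans most of the image. The run-length test, the `min_fill` threshold, and the names `longest_run`, `strip_indices` and `find_border_lines` are assumptions for illustration, not something given in the question. The image is assumed to be already binarized into rows of 0/1 values, with 1 meaning a dark pixel.

```python
def longest_run(values):
    """Length of the longest run of consecutive dark (truthy) pixels."""
    best = current = 0
    for v in values:
        current = current + 1 if v else 0
        best = max(best, current)
    return best


def strip_indices(size, s):
    """Indices within s pixels of either end of an axis of length size."""
    s = min(s, size)
    return sorted(set(range(0, s)) | set(range(size - s, size)))


def find_border_lines(img, s, min_fill=0.9):
    """Return (columns, rows) inside the border strips that look like lines.

    img      -- list of rows, each a list of 0/1 values (1 = dark pixel)
    s        -- strip width in pixels (the 's' from the answer above)
    min_fill -- assumed threshold: fraction of the column height (or row
                width) that the longest dark run must cover
    """
    h = len(img)
    w = len(img[0]) if h else 0
    cols = [x for x in strip_indices(w, s)
            if longest_run(row[x] for row in img) >= min_fill * h]
    rows = [y for y in strip_indices(h, s)
            if longest_run(img[y]) >= min_fill * w]
    return cols, rows


# Hypothetical 10 x 8 test image.
img = [[0] * 10 for _ in range(8)]
for y in range(8):
    img[y][1] = 1          # solid border line near the left edge
    img[y][5] = 1          # solid stroke in the content area (outside strips)
for y in range(0, 8, 2):
    img[y][9] = 1          # dense but broken column near the right edge

print(find_border_lines(img, 2))  # ([1], [])
```

The run-length check is what separates a line from a merely dense column: the broken column at x = 9 has many dark pixels but no long unbroken run, so it is rejected, and the solid stroke at x = 5 is never examined because it lies outside the strips. On real scans the border is rarely perfectly straight or one pixel wide, so `min_fill` and `s` would need tuning.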