Optical Character Recognition with Python - Auto Cropping

I am working on various OCR tasks, pre-processing with Python and analyzing with Tesseract.

The latest problem is how to crop an image that contains several separate items, e.g. a scan of 6 business cards or a photo of a board with two distinct sections. I would like to split such a scan (one .jpg or .png file) into 6 images, one per business card.

Ideally, I would like to do this in Python (R would also work), but I'm open to any and all suggestions. Thanks.

Solution

With OpenCV you may be able to find the cards' contours, as shown in the OpenCV documentation on finding contours, and then crop the image around each one.

I also wrote a grid detector. If your cards are all the same size, it may be a source of inspiration, and it may still give you some ideas if they are not: https://github.com/julienpalard/grid-finder
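For the simplest same-size case, where you already know the layout (say 2 rows of 3 cards filling the scan), you may not need any detection at all. The sketch below is not how the linked project works. It is just a plain equal split with NumPy slicing, and split_grid is a hypothetical helper name.

```python
import numpy as np


def split_grid(image, rows, cols):
    """Cut an image array into rows * cols tiles, in row-major order."""
    height, width = image.shape[:2]
    # Integer cell edges; linspace spreads any leftover pixels across the cells.
    ys = np.linspace(0, height, rows + 1).astype(int)
    xs = np.linspace(0, width, cols + 1).astype(int)
    return [
        image[ys[r]:ys[r + 1], xs[c]:xs[c + 1]]
        for r in range(rows)
        for c in range(cols)
    ]
```

This only works when the cards really sit on a regular grid with no large margins. Otherwise the contour approach, or detecting the grid lines first, is the safer route.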