
What are the coordinates of a rectangle in tess4j doOCR?


I'm trying to use tess4j to scan multipage PDF files. I use the following code:

PdfUtilities.splitPdf(imageFile, outputFile, startPage, endPage);
List<IIOImage> imageList = ImageIOHelper.getIIOImageList(outputFile);
String result = instance.doOCR(imageList, null);

However, for speed reasons, I only want to scan the top half of each page (actually even less, but say half for argument's sake). The API specifies that where I currently pass null I can pass a Rectangle rect, but I have found no documentation of what the rectangle's coordinates refer to. The PDFs come from different providers, if that makes any difference.


Solution

  • The rectangle specifies a region within the image's boundary, in pixel coordinates, with (0,0) at the top-left corner of the image.

    http://tess4j.sourceforge.net/docs/docs-3.0/
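
To make the coordinate convention concrete, here is a minimal sketch of building a rectangle that covers the top part of a page image. The class name TopRegion and the method topFraction are hypothetical helpers written for this illustration, not part of tess4j; the sketch uses only standard java.awt classes and assumes the rectangle is measured in pixels from the top-left corner, as described above.

```java
import java.awt.Rectangle;
import java.awt.image.RenderedImage;

// Hypothetical helper (not part of tess4j): builds an OCR region
// covering the top portion of an image.
public class TopRegion {

    // Returns a rectangle anchored at (0,0), the top-left corner,
    // spanning the full image width and the given fraction of its height.
    static Rectangle topFraction(RenderedImage image, double fraction) {
        if (fraction <= 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        int height = (int) Math.round(image.getHeight() * fraction);
        // Rectangle(x, y, width, height): x grows to the right, y grows downward.
        return new Rectangle(0, 0, image.getWidth(), height);
    }
}
```

The resulting rectangle would then go where the question currently passes null, for example instance.doOCR(imageList, rect). Since the PDFs come from different providers, page images may differ in pixel size, so it is probably safer to compute the rectangle from each page's own dimensions (for instance via IIOImage.getRenderedImage()) than to reuse one fixed rectangle for every page.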