I'm working on a Tess4J project, and using Tess4J I've gotten the coordinates of words in the image. The only problem is that these are coordinates for a TIFF image. My project involves writing a layer of text over the image in a PDF document. I take it the resolution of a PDF document is 72 DPI, so the coordinates are distorted and too widely spaced. If I bring the resolution down from 300 DPI to 72 DPI and then pass the image to Tesseract, won't I get the coordinates I need? If not, are there any alternatives? I already tried multiplying the coordinates by 300/72. Surprisingly, that doesn't work.
Thanks in advance!
To convert from 300 DPI to 72 DPI, you need to multiply by 72/300, not the other way round. Either do it in floating point, or multiply first and divide afterwards, as in (x * 72) / 300; in integer arithmetic, 72/300 on its own evaluates to 0. PDF units are 1/72 of an inch by default.
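As a rough illustration of that scaling, here is a minimal Java sketch. The class and method names (PdfCoords, toPoints, toPdfBox) are made up for this example and are not part of Tess4J. It also assumes something the question does not state: that the PDF page has the same physical size as the scanned image and uses the default PDF coordinate system, whose origin is the bottom-left corner, whereas OCR bounding boxes are measured from the top-left. If your PDF library already places content from the top-left, skip the vertical flip.

```java
public class PdfCoords {
    // Default PDF user space: 72 units per inch.
    static final double PDF_UNITS_PER_INCH = 72.0;

    /** Converts a length in pixels to PDF units, given the image resolution in DPI. */
    public static double toPoints(double pixels, double dpi) {
        // Floating-point arithmetic avoids the integer pitfall where 72 / 300 == 0.
        return pixels * PDF_UNITS_PER_INCH / dpi;
    }

    /**
     * Converts a pixel box with a top-left origin (as reported by OCR) into
     * PDF units with a bottom-left origin (assumed here, see the note above).
     * Returns {x, y, width, height}, where (x, y) is the lower-left corner.
     */
    public static double[] toPdfBox(int x, int y, int width, int height,
                                    int imageHeightPx, double dpi) {
        double pdfX = toPoints(x, dpi);
        // Flip vertically: distance from the bottom edge of the image to the bottom of the box.
        double pdfY = toPoints(imageHeightPx - (y + height), dpi);
        return new double[] { pdfX, pdfY, toPoints(width, dpi), toPoints(height, dpi) };
    }

    public static void main(String[] args) {
        // Illustrative values only: a word box on a 3300 px tall page scanned at 300 DPI.
        double[] box = toPdfBox(300, 150, 600, 75, 3300, 300.0);
        System.out.println(box[0] + " " + box[1] + " " + box[2] + " " + box[3]);
    }
}
```

With Tess4J, the pixel values would come from each word's bounding box; a 300 px offset at 300 DPI is one inch, so it maps to 72 PDF units.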
Scaling down the original image is not a good idea, since the loss of information will reduce the quality of the recognized text.