How to extract data from a specific rectangular area in a PDF using Java?

I am trying to extract data from a specific rectangular region inside a PDF, where the region is specified by two coordinates. Is it possible to do this directly on the PDF, or would I have to convert it into an image and use OCR? If so, does PDFBox or iText include a way to analyze images via OCR? Thanks!

Solution

If the area contains actual text (rather than a scanned image), PDFBox can extract it directly with PDFTextStripperByArea:

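import java.awt.Rectangle;
import java.io.File;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripperByArea;
// PDFBox 2.x API; in 3.x the file is opened with Loader.loadPDF(...) instead.
try (PDDocument document = PDDocument.load(new File("target.pdf"))) {
    PDFTextStripperByArea stripper = new PDFTextStripperByArea();
    stripper.setSortByPosition(true);
    // x, y, width, height in points, with y measured from the top of the page.
    Rectangle rect = new Rectangle(35, 375, 340, 204);
    stripper.addRegion("class1", rect);
    stripper.extractRegions(document.getPage(1)); // page index is zero-based: this is the second page
    System.out.println(stripper.getTextForRegion("class1"));
}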
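Since the question describes the region by two coordinates, here is a minimal sketch of turning two opposite corners into the rectangle passed to addRegion. It assumes the corners are given in PDF user space (origin at the bottom-left, y growing upward) and that the region expects y measured from the top of the page; the class name RegionFromCorners, the method fromCorners, and the 792 pt page height are hypothetical and only for illustration.

```java
import java.awt.geom.Rectangle2D;

public class RegionFromCorners {

    /**
     * Builds a region from two opposite corners given in PDF user space
     * (assumed: origin bottom-left, y growing upward) and converts it to
     * coordinates whose y is measured from the top of the page.
     */
    static Rectangle2D.Double fromCorners(double x1, double y1,
                                          double x2, double y2,
                                          double pageHeight) {
        double left = Math.min(x1, x2);
        double width = Math.abs(x2 - x1);
        double height = Math.abs(y2 - y1);
        // Flip the y axis: the upper edge becomes a distance from the page top.
        double top = pageHeight - Math.max(y1, y2);
        return new Rectangle2D.Double(left, top, width, height);
    }

    public static void main(String[] args) {
        // Hypothetical corners on a page assumed to be 792 pt tall (US Letter).
        Rectangle2D.Double region = fromCorners(35, 213, 375, 417, 792);
        // x=35, y=375, width=340, height=204: the same region as in the answer.
        System.out.println(region);
    }
}
```

The result can be passed straight to stripper.addRegion("class1", region), since addRegion accepts a Rectangle2D. In PDFBox the page height would typically come from page.getMediaBox().getHeight(); if the corners are already measured from the top-left, skip the flip and use the smaller y as the top.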
