I have some PDF documents whose main content is vector graphics (path-drawing operators, not bitmap images). The content stream looks like the excerpt below.
IMPORTANT NOTE: These are the only types of operators in the PDF. It does not contain text, images, or other types of objects (I reviewed all of the content with the PDFBox debugger).
q
0.75 0 0 -0.75 36.12 573.96 cm
0 0 0 rg
0 0 m
2.24 0 l
2.24 5.92 l
3.04 5.92 l
3.04 0 l
5.28 0 l
5.28 -0.8 l
0 -0.8 l
0 0 l
h
f
Q
q
0.75 0 0 -0.75 43.800003 572.04 cm
0 0 0 rg
0 0 m
0 -1.44 -0.96 -1.76 -1.76 -1.76 c
-2.56 -1.76 -3.04 -1.28 -3.2 -0.96 c
-3.2 -0.96 l
-3.2 -3.36 l
-4 -3.36 l
-4 3.36 l
-3.2 3.36 l
-3.2 0.64 l
-3.2 -0.64 -2.56 -0.96 -1.92 -0.96 c
-1.12 -0.96 -0.8 -0.64 -0.8 0.16 c
-0.8 3.36 l
0 3.36 l
0 0 l
h
f
Q
.
.
.
Each block that starts with "q" and ends with "Q" seems to draw one small shape (a single character, in the case of my document).
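To make the geometry of one such block concrete, here is a minimal sketch that uses only java.awt.geom (no PDFBox). It replays the first "q ... Q" block from the excerpt and computes its bounding box in page coordinates. The class name FirstGlyphBounds is hypothetical, and the sketch assumes that no other transformation is active outside the block, which the excerpt alone cannot confirm.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Path2D;
import java.awt.geom.Rectangle2D;
import java.util.Locale;

// Hypothetical helper, not part of PDFBox: bounding box of the first q ... Q block.
final class FirstGlyphBounds {

    static Rectangle2D compute() {
        // "0.75 0 0 -0.75 36.12 573.96 cm": the PDF operands a b c d e f map
        // onto AffineTransform(m00, m10, m01, m11, m02, m12) in the same order.
        AffineTransform ctm = new AffineTransform(0.75, 0, 0, -0.75, 36.12, 573.96);

        // Path operators of the block: "m" = moveTo, "l" = lineTo, "h" = closePath.
        Path2D.Double path = new Path2D.Double();
        path.moveTo(0, 0);
        path.lineTo(2.24, 0);
        path.lineTo(2.24, 5.92);
        path.lineTo(3.04, 5.92);
        path.lineTo(3.04, 0);
        path.lineTo(5.28, 0);
        path.lineTo(5.28, -0.8);
        path.lineTo(0, -0.8);
        path.lineTo(0, 0);
        path.closePath();

        // Assumption: the matrix above is the only transform in effect.
        return ctm.createTransformedShape(path).getBounds2D();
    }

    public static void main(String[] args) {
        Rectangle2D box = compute();
        // Prints: x=36.12 y=569.52 w=3.96 h=5.04
        System.out.println(String.format(Locale.ROOT, "x=%.2f y=%.2f w=%.2f h=%.2f",
                box.getX(), box.getY(), box.getWidth(), box.getHeight()));
    }
}
```

Because PDF user space has its origin at the bottom left, the reported y value is the lower edge of the box. The negative d operand (-0.75) flips the shape vertically, which is why the path's own y coordinates look upside down.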
This is how the page looks in Adobe Acrobat: [screenshot taken from Adobe Acrobat]
I need to determine the bounding box values (X-Y coordinates, width, and height) of these shapes as if they were a single object, like below: [bounding box representation from Adobe Acrobat]
As mentioned above, I determined that each "character" is one block of "q" ... "Q" operators in the PDF content stream.
I wonder if we can get those dimensions (of the big bounding box) using Java and PDFBox, just as Adobe Acrobat does.
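The "big" bounding box is simply the union of the per-character boxes. The sketch below shows that step with plain java.awt.geom. The class name BoxUnion is hypothetical, and the two sample rectangles are the boxes of the two blocks in the excerpt, worked out by applying each block's "cm" matrix to its path coordinates (again assuming no outer transformation).

```java
import java.awt.geom.Rectangle2D;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: merges per-character boxes into one enclosing box.
final class BoxUnion {

    // Returns the smallest rectangle containing every box, or null for an empty list.
    static Rectangle2D union(List<Rectangle2D> boxes) {
        Rectangle2D total = null;
        for (Rectangle2D box : boxes) {
            if (total == null) {
                total = (Rectangle2D) box.clone(); // copy so the input is not modified
            } else {
                total.add(box); // grows 'total' in place to include 'box'
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Boxes of the first and second block of the excerpt (x, y, width, height).
        List<Rectangle2D> boxes = Arrays.<Rectangle2D>asList(
                new Rectangle2D.Double(36.12, 569.52, 3.96, 5.04),
                new Rectangle2D.Double(40.800003, 569.52, 3.0, 5.04));
        Rectangle2D total = union(boxes);
        // Prints: x=36.12 y=569.52 w=7.68 h=5.04
        System.out.println(String.format(Locale.ROOT, "x=%.2f y=%.2f w=%.2f h=%.2f",
                total.getX(), total.getY(), total.getWidth(), total.getHeight()));
    }
}
```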
I followed the same approach as the one posted here:
pdfbox 2.0.2 > Calling of PageDrawer.processPage method caught exceptions
That answer places the logic in the strokePath() method, but in my case, as suggested by @TilmanHausherr, I put my logic in fillPath() instead, because the content stream above paints every path with the fill operator "f" rather than stroking it.
Note that the class you define must extend PDFGraphicsStreamEngine.
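The bookkeeping behind that idea can be sketched without PDFBox on the classpath. The class below is hypothetical (FilledPathBoundsCollector is my own name). It mirrors the path callbacks that PDFGraphicsStreamEngine declares (moveTo, lineTo, curveTo, closePath, fillPath), so a subclass could forward each override to the matching method here. It assumes the coordinates it receives are already in page space; as far as I can tell PDFBox applies the current transformation matrix before calling these callbacks, but check that against your version.

```java
import java.awt.geom.Path2D;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects one box per filled path plus the overall box.
final class FilledPathBoundsCollector {
    private final Path2D.Double current = new Path2D.Double();
    private final List<Rectangle2D> boxes = new ArrayList<>();
    private Rectangle2D total; // null until the first path has been filled

    void moveTo(float x, float y) { current.moveTo(x, y); }

    void lineTo(float x, float y) { current.lineTo(x, y); }

    void curveTo(float x1, float y1, float x2, float y2, float x3, float y3) {
        current.curveTo(x1, y1, x2, y2, x3, y3);
    }

    void closePath() { current.closePath(); }

    // Call from the engine's fillPath override: the path is complete here.
    void fillPath() {
        if (current.getCurrentPoint() == null) {
            return; // nothing was drawn, so there is no box to record
        }
        // Note: for curves, some JDK versions include control points in the bounds.
        Rectangle2D box = current.getBounds2D();
        boxes.add(box);
        if (total == null) {
            total = (Rectangle2D) box.clone();
        } else {
            total.add(box);
        }
        current.reset(); // start a fresh path for the next block
    }

    List<Rectangle2D> getBoxes() { return boxes; }

    Rectangle2D getTotal() { return total; }
}
```

Keeping the geometry in a separate class like this is only one possible design; the same fields and logic could live directly in the PDFGraphicsStreamEngine subclass.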