How to detect text blocks and columns in a PDF with Tess4J

I'm new to Tesseract (Tess4J). I have managed to use the main features, such as reading text, getting word positions from an image or a PDF, rotating, etc.

I can't find a way to detect blocks of text (paragraphs or columns), and I'm not sure whether it can be done easily. Also, if the PDF contains other kinds of blocks, such as images, is it possible to extract them, or at least to get the position (bounding box) of each block?

Solution

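You can use the TessBaseAPIGetComponentImages API method with the RIL_BLOCK iterator level. The third argument (text_only) is TRUE here, so only text components are returned; passing FALSE should also return non-text components such as images: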

Boxa boxes = api.TessBaseAPIGetComponentImages(handle, TessPageIteratorLevel.RIL_BLOCK, TRUE, null, null);

Check Tess4J unit tests for complete examples.
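
For the column part of the question, one possible post-processing step is to group the block boxes by horizontal overlap. The following is a minimal sketch, not part of Tess4J: the class ColumnGrouper and the method groupIntoColumns are hypothetical names, and the input is assumed to be a list of java.awt.Rectangle values already copied out of the Boxa returned above. It also assumes a simple layout; a full-width block such as a page heading would merge the columns beneath it into one group.

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Tess4J: groups block bounding boxes
// into columns based on overlapping horizontal extents.
final class ColumnGrouper {

    private ColumnGrouper() {
    }

    /**
     * Returns the blocks grouped into columns, ordered left to right.
     * Blocks inside each column are ordered top to bottom.
     */
    static List<List<Rectangle>> groupIntoColumns(List<Rectangle> blocks) {
        // Work on a copy so the caller's list is not reordered.
        List<Rectangle> sorted = new ArrayList<>(blocks);
        sorted.sort(Comparator.comparingInt((Rectangle r) -> r.x));

        List<List<Rectangle>> columns = new ArrayList<>();
        int columnRight = Integer.MIN_VALUE; // right edge of the current column

        for (Rectangle block : sorted) {
            if (columns.isEmpty() || block.x >= columnRight) {
                // No horizontal overlap with the current column: start a new one.
                columns.add(new ArrayList<>());
                columnRight = block.x + block.width;
            } else {
                // Overlaps the current column: widen it if needed.
                columnRight = Math.max(columnRight, block.x + block.width);
            }
            columns.get(columns.size() - 1).add(block);
        }

        // Reading order inside a column is top to bottom.
        for (List<Rectangle> column : columns) {
            column.sort(Comparator.comparingInt((Rectangle r) -> r.y));
        }
        return columns;
    }
}
```

Calling ColumnGrouper.groupIntoColumns(blocks) on the block rectangles of a two-column page would be expected to yield two lists, one per column, each in reading order.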
