I am working with multi-page scanned text documents and I use the FineReader 12 SDK as the underlying OCR engine. Sometimes a document is scanned upside down or in some other orientation, and as a result all the recognized characters come out as unrecognized symbols.
Any help appreciated.
This problem can be solved using custom .ini processing profiles. You can automatically detect orientation and skew using the right properties, then apply or prohibit orientation correction and/or deskewing.
In your code, between Engine initialization and recognition, call IEngine::LoadProfile as described in the "Working with Profiles" section of the FRE documentation.
Create a new file, document.ini, somewhere in your project and pass it to this method call so that the SDK checks the properties in this file before processing your files.
Add these lines in your freshly created file:
[PageProcessingParams]
PerformPreprocessing = TRUE <- allows engine to preprocess image
PerformAnalysis = TRUE
PerformRecognition = TRUE
[PagePreprocessingParams]
CorrectGeometry = TSPV_Auto
CorrectInvertedImage = TRUE
CorrectOrientation = TRUE <- correct orientation automatically
CorrectSkew = TSPV_Yes <- correct skew automatically
[OrientationDetectionParams]
OrientationDetectionMode = ODM_Normal <- detect orientation automatically
ProhibitClockwiseRotation = FALSE |
ProhibitCounterclockwiseRotation = FALSE <-| allow all orientations
ProhibitUpsideDownRotation = FALSE |
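If you would rather generate the profile than type it by hand, here is a minimal sketch in Python that writes the same three sections to document.ini. It only produces the text file and never calls the SDK. The section and key names are copied from the listing above; the names PROFILE, render_profile and write_profile are made up for this example. I am also assuming that the "<-" notes in the listing are explanations for the reader rather than part of the settings, so the sketch leaves them out.

```python
import configparser
import io

# Section and key names are copied from the profile listing in this answer.
# Nothing in this script talks to the OCR engine; it only writes a text file.
PROFILE = {
    "PageProcessingParams": {
        "PerformPreprocessing": "TRUE",
        "PerformAnalysis": "TRUE",
        "PerformRecognition": "TRUE",
    },
    "PagePreprocessingParams": {
        "CorrectGeometry": "TSPV_Auto",
        "CorrectInvertedImage": "TRUE",
        "CorrectOrientation": "TRUE",
        "CorrectSkew": "TSPV_Yes",
    },
    "OrientationDetectionParams": {
        "OrientationDetectionMode": "ODM_Normal",
        "ProhibitClockwiseRotation": "FALSE",
        "ProhibitCounterclockwiseRotation": "FALSE",
        "ProhibitUpsideDownRotation": "FALSE",
    },
}


def render_profile(sections):
    """Return the profile as .ini text, keeping key case and section order."""
    parser = configparser.ConfigParser(interpolation=None)
    # configparser lowercases option names by default; keep them as written.
    parser.optionxform = str
    parser.read_dict(sections)
    buffer = io.StringIO()
    parser.write(buffer)  # emits "Key = Value" lines under "[Section]" headers
    return buffer.getvalue()


def write_profile(path, sections=PROFILE):
    """Write the rendered profile to the given path."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(render_profile(sections))


if __name__ == "__main__":
    write_profile("document.ini")
```

The resulting document.ini is the file you then hand to IEngine::LoadProfile from your own application code.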
If you do not want to use a file for setting these properties for any reason, you can set them in your code. Have a look at the documentation describing the whole tree of parameter objects for that. With a file, it is much easier to see what you are doing than by browsing hundreds of lines of code.
For your language issue, I suggest you use RecognizerParams and enforce specific properties. Again, have a look at the documentation for custom profiles, as they are pretty powerful.
[RecognizerParams]
TextLanguage = English <- force english
LanguageDetectionMode = TSPV_No <- TSPV_Yes or TSPV_No are acceptable values
After doing this, you should be good to go, and all your image files should be close to 0° orientation for processing. Choosing a language based on document orientation is a very specific workflow; the only option is to code it yourself.
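To give an idea of what coding that workflow could look like, here is a rough sketch in Python. Everything in it is an assumption made for illustration: the rule table, the function names, and the idea that your own code already knows the rotation detected for a page (how you get that value from the SDK is not shown). The language pairs are placeholders, not values from the SDK or from your question. It only renders a [RecognizerParams] section using the two keys shown above.

```python
# Hypothetical rule table: detected rotation in degrees -> language name to
# put in TextLanguage. The pairs are placeholders for illustration only.
LANGUAGE_BY_ROTATION = {0: "English", 180: "English", 90: "German", 270: "German"}
DEFAULT_LANGUAGE = "English"


def language_for_rotation(rotation_degrees):
    """Pick a language for a page given the rotation detected for it."""
    normalized = rotation_degrees % 360  # so -90 is treated like 270
    return LANGUAGE_BY_ROTATION.get(normalized, DEFAULT_LANGUAGE)


def recognizer_section(rotation_degrees):
    """Render a [RecognizerParams] section that forces the chosen language."""
    language = language_for_rotation(rotation_degrees)
    return (
        "[RecognizerParams]\n"
        f"TextLanguage = {language}\n"
        "LanguageDetectionMode = TSPV_No\n"
    )


if __name__ == "__main__":
    print(recognizer_section(90))
```

A table like this keeps the decision in one place, so changing which orientation maps to which language does not require touching the rest of the processing code.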
Good luck with your project!