I have an API running in a Docker Linux image that uses a Tesseract wrapper to read text from images. Every time Tesseract processes an image, it logs a lot of warnings and messages:
| Estimating resolution as 682
| Empty page!!
| Estimating resolution as 682
| Empty page!!
| Warning: Invalid resolution 0 dpi. Using 70 instead.
| Estimating resolution as 1408
A single request can invoke Tesseract up to 50 times, which turns the logs into a huge mess. I use Microsoft.Extensions.Logging to log the information I need. I tried disabling Tesseract's logging in appsettings.json like this:
{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft": "Warning",
      "Microsoft.Hosting.Lifetime": "Information",
      "Tesseract": "Error"
    }
  },
I also tried setting "Tesseract": "None", but neither helped. I looked through the Tesseract documentation as well, but didn't find anything. Is there any way to disable the logs from Tesseract only?
Alright, I've found a small workaround for this. After initializing the Tesseract engine instance, to remove
| Warning: Invalid resolution 0 dpi. Using 70 instead.
I needed to set the DPI manually on the engine:
_tesseractEngine.SetVariable("user_defined_dpi", "300");
And to remove "Empty page!!", either debug_file needs to be set to NUL (the Windows null device; on Linux the equivalent is /dev/null):
_tesseractEngine.SetVariable("debug_file", "NUL");
or DefaultPageSegMode needs to be set correctly:
_tesseractEngine.DefaultPageSegMode = PageSegMode.SingleBlock;
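Putting the settings above together, here is a minimal sketch of how the engine could be configured in one place. It assumes the wrapper is the Tesseract NuGet package (namespace Tesseract), which matches the SetVariable and PageSegMode calls above. The names QuietTesseract, NullDevicePath and Create are hypothetical and only for illustration, and picking /dev/null on non-Windows systems is my assumption for a Linux container, not something I have verified against every Tesseract build.

```csharp
using System.Runtime.InteropServices;
using Tesseract;

public static class QuietTesseract
{
    // Hypothetical helper: choose the null device for the current OS.
    // "NUL" only discards output on Windows; on Linux it would just be
    // a regular file name, so /dev/null is assumed there instead.
    public static string NullDevicePath() =>
        RuntimeInformation.IsOSPlatform(OSPlatform.Windows) ? "NUL" : "/dev/null";

    // Hypothetical factory: creates an engine with the three settings
    // described above applied right after initialization.
    public static TesseractEngine Create(string tessdataPath, string language)
    {
        var engine = new TesseractEngine(tessdataPath, language, EngineMode.Default);

        // Avoids "Warning: Invalid resolution 0 dpi. Using 70 instead."
        engine.SetVariable("user_defined_dpi", "300");

        // Redirects Tesseract's own diagnostic output (e.g. "Empty page!!").
        engine.SetVariable("debug_file", NullDevicePath());

        // Alternative fix for "Empty page!!": pick a segmentation mode
        // that fits the input. SingleBlock is the value used above.
        engine.DefaultPageSegMode = PageSegMode.SingleBlock;

        return engine;
    }
}
```

Usage would then look something like `_tesseractEngine = QuietTesseract.Create("./tessdata", "eng");`, where the tessdata path and language are placeholders. Since the appsettings.json filter had no effect, these messages presumably do not pass through Microsoft.Extensions.Logging at all, which is why the settings have to be applied on the engine itself.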