C#: How to ignore logs from Tesseract?


I have an API running on a Docker Linux image that uses a Tesseract wrapper to read text from images. Every time Tesseract processes an image, it logs a lot of warnings and messages:

| Estimating resolution as 682

| Empty page!!

| Estimating resolution as 682

| Empty page!!

| Warning: Invalid resolution 0 dpi. Using 70 instead.

| Estimating resolution as 1408

One request invokes Tesseract up to 50 times, which makes the logs a huge mess. To log the information I need, I use Microsoft.Extensions.Logging. I tried disabling Tesseract's logging in appsettings.json like this:

{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft": "Warning",
      "Microsoft.Hosting.Lifetime": "Information",
      "Tesseract": "Error"
    }
  },

I also tried setting "Tesseract": "None", but neither helped. I looked through the Tesseract documentation as well, but didn't find anything. Is there any way to disable only the logs from Tesseract?


Solution

  • Alright, I've found a little workaround for this. After initializing an instance of the Tesseract engine, to remove

    | Warning: Invalid resolution 0 dpi. Using 70 instead.

    I needed to set the DPI manually on the engine:

    _tesseractEngine.SetVariable("user_defined_dpi", "300"); 
    

    And to remove "Empty page!!", either debug_file needs to be set to NUL (the Windows null device; on Linux the equivalent is /dev/null),

    _tesseractEngine.SetVariable("debug_file", "NUL");
    

    Or DefaultPageSegMode needs to be set to a mode that matches the layout of the images, for example:

    _tesseractEngine.DefaultPageSegMode = PageSegMode.SingleBlock;
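
    Putting the settings above together, a minimal sketch could look like the following. The class name QuietTesseract and the method Create are hypothetical, and the platform check is an assumption on my side: "NUL" is only the null device on Windows, so the sketch uses "/dev/null" elsewhere (e.g. in a Linux container). These messages appear to be written by the native Tesseract library itself rather than through Microsoft.Extensions.Logging, which would explain why the LogLevel filter in appsettings.json had no effect.

    ```csharp
    using System.Runtime.InteropServices;
    using Tesseract;

    // Hypothetical helper, not part of the Tesseract wrapper.
    public static class QuietTesseract
    {
        public static TesseractEngine Create(string tessdataPath, string language)
        {
            var engine = new TesseractEngine(tessdataPath, language, EngineMode.Default);

            // Avoids "Warning: Invalid resolution 0 dpi" for images without DPI metadata.
            // 300 is the value from the answer above; adjust it to your images.
            engine.SetVariable("user_defined_dpi", "300");

            // Assumption: pick the null device for the current OS so that
            // Tesseract's debug output (e.g. "Empty page!!") is discarded.
            var nullDevice = RuntimeInformation.IsOSPlatform(OSPlatform.Windows)
                ? "NUL"
                : "/dev/null";
            engine.SetVariable("debug_file", nullDevice);

            // Alternative from the answer above: use a page segmentation mode
            // that matches the images being processed.
            engine.DefaultPageSegMode = PageSegMode.SingleBlock;

            return engine;
        }
    }
    ```

    SetVariable returns a bool, so it may be worth checking the result if a setting does not seem to take effect.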