I have some PNG files with multiple sentences in two different colors: black (Davy's Gray) and light brown (Mushroom).
I'm only interested in the black text, so I tried changing the light brown text to the background color using Input.ReplaceColor,
but there are many shades of that color, and I always end up with some weird characters caused by the small residues left behind.
Here's my current code:
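    // i is the index of the current capture file (the enclosing loop is not shown)
    var Ocr = new IronTesseract();
    using (var Input = new OcrInput())
    {
        // Only read the part of the screenshot that contains the text
        var ContentArea = new Rectangle() { X = 872, Y = 130, Height = 900, Width = 725 };
        Input.AddImage(@"C:\OCR\Capture (" + i + ").PNG", ContentArea);
        // Replace the light brown text color with the background color (tolerance 25)
        Input.ReplaceColor(Color.FromArgb(185, 163, 143), Color.FromArgb(235, 226, 216), 25);
        Input.Sharpen();
        Input.ToGrayScale();
        var Result = Ocr.Read(Input);
        // Append the recognized text and scroll to the end of the box
        richTextBox1.AppendText(Result.Text + Environment.NewLine);
        richTextBox1.SelectionStart = richTextBox1.Text.Length;
        richTextBox1.ScrollToCaret();
    }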
Edit: The answer is "No" for now; hopefully they will release this feature in the future.
The only option for now is to play with the colors until you find the best parameters.
If you know a better alternative to IronOCR that is free (even if only for development), I'll gladly take it.
The answer below was edited in response to a comment.
Since the color you wish to eliminate is not a single shade, you could search for all pixels within a color range and replace them all with the background color.
I haven’t used IronTesseract before, so I don’t know if it has this feature, but you can do it with the System.Drawing Bitmap class as follows:
    using System.Drawing;
    using System.Drawing.Imaging;

    Color c1 = Color.FromArgb(180, 157, 136); // lower bound of the range to remove
    Color c2 = Color.FromArgb(238, 228, 219); // upper bound of the range to remove
    Color bkColor = Color.FromArgb(235, 226, 216); // background

    using (var image = new Bitmap("BsRyL.png"))
    {
        for (int x = 0; x < image.Width; x++)
        {
            for (int y = 0; y < image.Height; y++)
            {
                Color c = image.GetPixel(x, y);
                // Replace the pixel only if all three channels fall inside the range
                if (c.R >= c1.R && c.R <= c2.R && c.G >= c1.G && c.G <= c2.G && c.B >= c1.B && c.B <= c2.B)
                    image.SetPixel(x, y, bkColor);
            }
        }
        image.Save("FilledWithBackgroundNL.png", ImageFormat.Png);
    }
The image with the light brown text filled with the background color looks like this:
This pixel-by-pixel manipulation is suitable if your images are all small like the sample you provided or you don’t care about performance. If you’re dealing with larger images (in the megapixel range), working with individual pixels can be slow.
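If you want to stay with System.Drawing on larger images, one common way to reduce the per-pixel overhead is to lock the bitmap once with LockBits and work on a byte array instead of calling GetPixel/SetPixel. The following is only a sketch under a few assumptions: the class and method names (FastColorRangeFill, FillRangeInBuffer, FillRangeInFile) are made up for illustration, the image is converted to 32bpp ARGB first, and I have not measured how much faster it is on your files.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

public static class FastColorRangeFill
{
    // Hypothetical helper: overwrites every pixel whose R, G and B all lie
    // inside [lower, upper] with the fill color and returns how many pixels
    // were replaced. The buffer is 32bpp ARGB, laid out in memory as B, G, R, A.
    public static int FillRangeInBuffer(byte[] bgra, Color lower, Color upper, Color fill)
    {
        int replaced = 0;
        for (int i = 0; i + 3 < bgra.Length; i += 4)
        {
            byte b = bgra[i], g = bgra[i + 1], r = bgra[i + 2];
            if (r >= lower.R && r <= upper.R &&
                g >= lower.G && g <= upper.G &&
                b >= lower.B && b <= upper.B)
            {
                bgra[i] = fill.B;
                bgra[i + 1] = fill.G;
                bgra[i + 2] = fill.R;
                // The alpha byte at i + 3 is left unchanged.
                replaced++;
            }
        }
        return replaced;
    }

    // Hypothetical wrapper: loads a file, fills the color range, saves a PNG.
    public static void FillRangeInFile(string inputPath, string outputPath,
                                       Color lower, Color upper, Color fill)
    {
        using (var source = new Bitmap(inputPath))
        using (var image = source.Clone(
            new Rectangle(0, 0, source.Width, source.Height),
            PixelFormat.Format32bppArgb))
        {
            var rect = new Rectangle(0, 0, image.Width, image.Height);
            BitmapData data = image.LockBits(rect, ImageLockMode.ReadWrite,
                                             PixelFormat.Format32bppArgb);
            try
            {
                // At 32 bits per pixel each row is exactly Width * 4 bytes,
                // so the buffer contains no padding bytes between rows.
                var buffer = new byte[data.Stride * data.Height];
                Marshal.Copy(data.Scan0, buffer, 0, buffer.Length);
                FillRangeInBuffer(buffer, lower, upper, fill);
                Marshal.Copy(buffer, 0, data.Scan0, buffer.Length);
            }
            finally
            {
                image.UnlockBits(data);
            }
            image.Save(outputPath, ImageFormat.Png);
        }
    }
}
```

With the same bounds as above, a call would look like `FastColorRangeFill.FillRangeInFile("BsRyL.png", "FilledWithBackgroundNL.png", Color.FromArgb(180, 157, 136), Color.FromArgb(238, 228, 219), Color.FromArgb(235, 226, 216));`.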
Another way to do this is to use an imaging toolkit such as LEADTOOLS (Disclaimer: I’m a LEADTOOLS employee). The code looks like this:
Leadtools.Codecs.RasterCodecs codecs = new Leadtools.Codecs.RasterCodecs();
Leadtools.RasterImage image = codecs.Load("BsRyL.png");
var c1 = new Leadtools.RasterColor(180, 157, 136); //lower color
var c2 = new Leadtools.RasterColor(238, 228, 219); //upper color
image.AddColorRgbRangeToRegion(c1, c2, Leadtools.RasterRegionCombineMode.Set);
var backgroundColor = new Leadtools.RasterColor(235, 226, 216);
Leadtools.ImageProcessing.FillCommand fill = new Leadtools.ImageProcessing.FillCommand(backgroundColor);
fill.Run(image);
codecs.Save(image, "FilledWithBackground.png", Leadtools.RasterImageFormat.Png, 24);
This could be useful if the images are large and higher performance is needed.
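To tie this back to the question, here is a rough sketch of how the cleaned image could be handed to IronOCR: fill the color range first, save the result to a temporary PNG, then read that file. The IronOCR calls are copied from the code in the question (I haven't used the library myself, so treat their availability and exact parameter types in your version as an assumption), and the names CleanThenOcr, FillRange and ReadBlackTextOnly are made up for illustration.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using IronOcr;

public static class CleanThenOcr
{
    // Same range test as the GetPixel loop earlier in this answer: a pixel is
    // replaced when all three channels lie between the lower and upper bounds.
    public static void FillRange(Bitmap image, Color lower, Color upper, Color fill)
    {
        for (int x = 0; x < image.Width; x++)
        {
            for (int y = 0; y < image.Height; y++)
            {
                Color c = image.GetPixel(x, y);
                if (c.R >= lower.R && c.R <= upper.R &&
                    c.G >= lower.G && c.G <= upper.G &&
                    c.B >= lower.B && c.B <= upper.B)
                {
                    image.SetPixel(x, y, fill);
                }
            }
        }
    }

    // Hypothetical helper: cleans one capture and returns the recognized text.
    public static string ReadBlackTextOnly(string sourcePath)
    {
        string cleanedPath = Path.Combine(
            Path.GetTempPath(),
            Path.GetFileNameWithoutExtension(sourcePath) + "_cleaned.png");

        using (var image = new Bitmap(sourcePath))
        {
            FillRange(image,
                      Color.FromArgb(180, 157, 136),   // lower color
                      Color.FromArgb(238, 228, 219),   // upper color
                      Color.FromArgb(235, 226, 216));  // background
            image.Save(cleanedPath, ImageFormat.Png);
        }

        // Calls below mirror the question's code; they are assumed to exist
        // unchanged in the installed IronOCR version.
        var ocr = new IronTesseract();
        using (var input = new OcrInput())
        {
            var contentArea = new Rectangle() { X = 872, Y = 130, Height = 900, Width = 725 };
            input.AddImage(cleanedPath, contentArea);
            return ocr.Read(input).Text;
        }
    }
}
```

Because the unwanted text is already gone before OCR runs, the ReplaceColor step from the question is left out here; whether Sharpen and ToGrayScale still help is something you would have to try on your own images.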