I'm using my digital camera as a quick and dirty scanner. Resolution is actually around 300dpi, which is quite reasonable. But my camera produces a color image, which I want reduced to a bitmap. I do not want to dither the image; I'm looking for what I would get if I put the document through a black-and-white scanner. Converting a JPEG to a grayscale image is easy and standard using djpeg -grayscale. The hard part is deciding which gray pixels should be white and which should be black.
The pbmplus tools offer
djpeg -grayscale -pnm img.jpg | pgmtopbm -threshold -value $v > img.pbm
But the killer is that value $v. Good values seem to range anywhere from 0.3 to 0.6, and repeated trial and error by hand is killing me. (For those more familiar with ImageMagick, the $v at hand is the value of the -black-threshold parameter.)
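The trial and error can at least be batched so that all candidates can be compared side by side. This is a minimal sketch, assuming djpeg and pgmtopbm are on the PATH; the function name sweep_thresholds, the 0.05 step, and the output file naming are arbitrary choices for illustration, not part of either tool.

```shell
#!/bin/sh
# Hypothetical helper: write one bitmap per candidate threshold so the
# results can be compared by eye. The range 0.30 to 0.60 mirrors the
# values that seemed to work by hand; the 0.05 step is arbitrary.
sweep_thresholds() {
    img=$1
    base=${img%.*}    # strip the extension: img.jpg -> img
    for v in 0.30 0.35 0.40 0.45 0.50 0.55 0.60; do
        djpeg -grayscale -pnm "$img" |
            pgmtopbm -threshold -value "$v" > "$base-$v.pbm"
    done
}
```

Calling sweep_thresholds img.jpg should leave img-0.30.pbm through img-0.60.pbm in the current directory. It does not pick a threshold, it only makes the manual search less tedious.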
I suppose I could build a GUI that would help me find a threshold faster by hand, but what I'm really looking for is an algorithm that sets the threshold for converting a grayscale image to a clean bitmap. Ideally this would work just by examining the structure of the grayscale image!
It's not exactly what I'd hoped for, but the mkbitmap program from the potrace project does a nice job converting photographs into bitmaps. There tend to be a few artifacts at edges, but it eliminates irrelevant low-spatial-frequency signal far better than simple thresholding, which cannot remove it at all.
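For reference, a pipeline along these lines is one way to run it. This is a sketch only, assuming mkbitmap is on the PATH and is used as a filter (PNM in on standard input, bitmap out on standard output); the function name photo_to_bitmap is made up, and the option values are illustrative starting points rather than tuned settings.

```shell
#!/bin/sh
# Hypothetical wrapper around mkbitmap.
#   -f 4     radius of the highpass filter that suppresses slow
#            brightness changes across the page
#   -s 2     scale the image up by this factor before thresholding
#            (the output is larger than the input)
#   -t 0.45  threshold applied after filtering and scaling
photo_to_bitmap() {
    djpeg -grayscale -pnm "$1" |
        mkbitmap -f 4 -s 2 -t 0.45 > "${1%.*}.pbm"
}
```

Calling photo_to_bitmap img.jpg writes img.pbm next to the input. A threshold is still involved, but because the highpass filter runs first, the result should depend much less on the exact value.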