How to save dpi info in py-opencv?


import cv2

def clear(img):
    back = cv2.imread("back.png", cv2.IMREAD_GRAYSCALE)
    img = cv2.bitwise_xor(img, back)
    ret, img = cv2.threshold(img, 120, 255, cv2.THRESH_BINARY_INV)
    return img


def threshold(img):
    ret, img = cv2.threshold(img, 120, 255, cv2.THRESH_BINARY_INV)
    img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    ret, img = cv2.threshold(img, 248, 255, cv2.THRESH_BINARY)
    return img


def formatImage(img):
    img = threshold(img)
    img = clear(img)
    return img


img = formatImage(cv2.imread("1566135246468.png",cv2.IMREAD_COLOR))
cv2.imwrite("aa.png",img)

This is my code. When I run tesseract-ocr on the image it writes, I get this warning:

Warning: Invalid resolution 0 dpi. Using 70 instead.

How can I set the dpi of the image I save?


Solution

  • AFAIK, OpenCV doesn't set the dpi of PNG files it writes, so you are looking at work-arounds. Here are some ideas...


    Method 1 - Use PIL/Pillow instead of OpenCV

    PIL/Pillow can write dpi information into PNG files. So you would:

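    Step 1 - Convert your BGR OpenCV image into RGB to match PIL's channel ordering (skip this step for a single-channel image, such as a thresholded one, because it has no channels to reorder)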

    from PIL import Image
    RGBimage = cv2.cvtColor(BGRimage, cv2.COLOR_BGR2RGB)
    

    Step 2 - Convert the OpenCV Numpy array into a PIL Image

    PILimage = Image.fromarray(RGBimage)
    

    Step 3 - Write with PIL

    PILimage.save('result.png', dpi=(72,72))
    

    As Fred mentions in the comments, you could equally use Python Wand in much the same way.
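
The three steps above can be sketched as one helper. This is only a sketch under assumptions: the function name save_png_with_dpi is made up for illustration, the 300 dpi default is an arbitrary choice (the step above uses 72), and a small synthetic array stands in for the thresholded image from the question. It handles single-channel images by skipping the channel swap, and reads the density back through Pillow to confirm that it was stored.

```python
import io

import numpy as np
from PIL import Image


def save_png_with_dpi(image, target, dpi=(300, 300)):
    """Write an OpenCV-style array to target (path or file object) as a PNG with dpi.

    Hypothetical helper: handles 2-D grayscale and 3-channel BGR arrays only.
    """
    if image.ndim == 3 and image.shape[2] == 3:
        # OpenCV orders colour channels as BGR; reverse the last axis to get RGB
        image = np.ascontiguousarray(image[:, :, ::-1])
    Image.fromarray(image).save(target, format="PNG", dpi=dpi)


# Synthetic binary image standing in for the thresholded result
demo = np.zeros((20, 40), dtype=np.uint8)
demo[5:15, 10:30] = 255

buf = io.BytesIO()
save_png_with_dpi(demo, buf)

# Read the density back; PNG stores pixels per metre, so expect a float near 300
buf.seek(0)
stored_dpi = Image.open(buf).info["dpi"]
print(stored_dpi)
```

Because the value is stored as whole pixels per metre, the dpi read back is close to, but not necessarily exactly, the number you passed in.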


    Method 2 - Write with OpenCV but modify afterwards with some tool

    You could use Python's subprocess module to shell out to, say, ImageMagick and set the dpi like this:

    magick OpenCVImage.png -set units pixelspercentimeter -density 28.3 result.png
    

    All you need to know is that PNG stores density in metric units rather than imperial ones (dots per inch). ImageMagick takes it here as dots per centimetre, and there are 2.54cm in an inch, so 72 dpi becomes 72 / 2.54, i.e. about 28.3 dots per cm.

    If your ImageMagick version is older than v7, replace magick with convert.
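
Shelling out from Python could look roughly like the sketch below. The function names density_command and set_density are made up for illustration, and it assumes ImageMagick is installed and on your PATH. The command is built as a list so that no shell quoting is needed, and the dpi to dots-per-cm conversion is done in code rather than by hand.

```python
import subprocess


def density_command(src, dst, dpi, executable="magick"):
    """Build the ImageMagick command that copies src to dst with the given dpi."""
    dots_per_cm = dpi / 2.54  # 2.54 cm in an inch
    return [
        executable, src,
        "-set", "units", "pixelspercentimeter",
        "-density", f"{dots_per_cm:.2f}",
        dst,
    ]


def set_density(src, dst, dpi=72, executable="magick"):
    """Run the command; raises CalledProcessError if ImageMagick reports failure."""
    subprocess.run(density_command(src, dst, dpi, executable), check=True)


# Show the command without running it
print(density_command("OpenCVImage.png", "result.png", 72))
```

For ImageMagick versions older than v7, pass executable="convert".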


    Method 3 - Write with OpenCV and insert dpi yourself

    You could write your file to memory using OpenCV's imencode(). Then search the encoded bytes for the first IDAT (image data) chunk, which is the one containing the image pixels, and insert a pHYs chunk that sets the density just before it. Then write to disk.

    It's not that hard actually - the pHYs payload is just 9 bytes (two 4-byte pixels-per-metre values and a 1-byte unit flag). See the PNG specification linked in the code, and also look at the pngcheck output at the end of the answer.

    This code is not production tested but seems to work pretty well for me:

    #!/usr/bin/env python3
    
    import struct
    import numpy as np
    import cv2
    import zlib
    
    def writePNGwithdpi(im, filename, dpi=(72,72)):
       """Save the image as PNG with embedded dpi"""
    
       # Encode as PNG into memory
       retval, buffer = cv2.imencode(".png", im)
       if not retval:
          raise ValueError("cv2.imencode failed to encode the image")
       # tobytes() replaces tostring(), which is deprecated in NumPy
       s = buffer.tobytes()
    
       # Find start of first IDAT chunk (its 4-byte length field precedes the type)
       IDAToffset = s.find(b'IDAT') - 4
    
       # Create our lovely new pHYs chunk - https://www.w3.org/TR/2003/REC-PNG-20031110/#11pHYs
       # Payload: x and y pixels per metre (dpi/0.0254), then unit flag 1 = metre
       pHYs = b'pHYs' + struct.pack('!IIc',round(dpi[0]/0.0254),round(dpi[1]/0.0254),b"\x01" )
       pHYs = struct.pack('!I',9) + pHYs + struct.pack('!I',zlib.crc32(pHYs))
    
       # Open output filename and write...
       # ... stuff preceding IDAT as created by OpenCV
       # ... new pHYs as created by us above
       # ... IDAT onwards as created by OpenCV
       with open(filename, "wb") as out:
          out.write(s[0:IDAToffset])
          out.write(pHYs)
          out.write(s[IDAToffset:])
    
    ################################################################################
    # main
    ################################################################################
    
    # Load sample image
    im = cv2.imread('lena.png')
    
    # Save at specific dpi
    writePNGwithdpi(im, "result.png", (32,300))
    

    Whichever method you use, you can run pngcheck -vv on the result to check what you have done, for example:

    pngcheck -vv a.png
    

    Sample Output

    File: a.png (306 bytes)
      chunk IHDR at offset 0x0000c, length 13
        100 x 100 image, 1-bit palette, non-interlaced
      chunk gAMA at offset 0x00025, length 4: 0.45455
      chunk cHRM at offset 0x00035, length 32
        White x = 0.3127 y = 0.329,  Red x = 0.64 y = 0.33
        Green x = 0.3 y = 0.6,  Blue x = 0.15 y = 0.06
      chunk PLTE at offset 0x00061, length 6: 2 palette entries
      chunk bKGD at offset 0x00073, length 1
        index = 1
      chunk pHYs at offset 0x00080, length 9: 255x255 pixels/unit (1:1). <-- THIS SETS THE DENSITY
      chunk tIME at offset 0x00095, length 7: 19 Aug 2019 10:15:00 UTC
      chunk IDAT at offset 0x000a8, length 20
        zlib: deflated, 2K window, maximum compression
        row filters (0 none, 1 sub, 2 up, 3 avg, 4 paeth):
          0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
          0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
          0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
          0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
          (100 out of 100)
      chunk tEXt at offset 0x000c8, length 37, keyword: date:create
      chunk tEXt at offset 0x000f9, length 37, keyword: date:modify
      chunk IEND at offset 0x0012a, length 0
    No errors detected in a.png (11 chunks, 76.5% compression).
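
If pngcheck is not available, the same check can be approximated in a few lines of Python. The sketch below uses only the standard library; the names iter_chunks and read_dpi are made up for illustration. It walks the chunks the way the listing above shows them (length, type, data, CRC) and converts a pHYs chunk back to dpi.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(png_bytes):
    """Yield (chunk_type, chunk_data) for each chunk of a PNG held in memory."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(png_bytes):
        if pos + 12 > len(png_bytes):
            raise ValueError("truncated chunk header")
        (length,) = struct.unpack("!I", png_bytes[pos:pos + 4])
        end = pos + 12 + length
        if end > len(png_bytes):
            raise ValueError("truncated chunk data")
        chunk_type = png_bytes[pos + 4:pos + 8]
        data = png_bytes[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack("!I", png_bytes[end - 4:end])
        # The CRC covers the type and data fields, but not the length field
        if zlib.crc32(chunk_type + data) != crc:
            raise ValueError("bad CRC in chunk %r" % chunk_type)
        yield chunk_type, data
        pos = end


def read_dpi(png_bytes):
    """Return (x_dpi, y_dpi) from the pHYs chunk, or None if no metric density is set."""
    for chunk_type, data in iter_chunks(png_bytes):
        if chunk_type == b"pHYs":
            x_ppm, y_ppm, unit = struct.unpack("!IIB", data)
            if unit != 1:  # 0 means "aspect ratio only", not a real density
                return None
            return (x_ppm * 0.0254, y_ppm * 0.0254)
    return None
```

For a file on disk you would call it as read_dpi(open("result.png", "rb").read()). A result of None corresponds to the "0 dpi" that Tesseract complains about.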
    

    While I was editing PNG chunks, I also managed to set a tIME chunk and a tEXt chunk with the Author. They go like this:

    # Create a new tIME chunk - https://www.w3.org/TR/2003/REC-PNG-20031110/#11tIME
    year, month, day, hour, minute, sec = 2020, 12, 25, 12, 0, 0    # Midday Christmas day 2020
    tIME = b'tIME' + struct.pack('!HBBBBB',year,month,day,hour,minute,sec)
    tIME = struct.pack('!I',7) + tIME + struct.pack('!I',zlib.crc32(tIME))
    
    # Create a new tEXt chunk - https://www.w3.org/TR/2003/REC-PNG-20031110/#11tEXt
    # Payload is the keyword, a NUL separator, then the text itself
    Author = "Author\x00Sir Mark The Great".encode('ascii')
    tEXt = b'tEXt' + Author
    tEXt = struct.pack('!I',len(Author)) + tEXt + struct.pack('!I',zlib.crc32(tEXt))
    
    # Open output filename and write...
    # ... stuff preceding IDAT as created by OpenCV
    # ... new pHYs as created by us above
    # ... new tIME as created by us above
    # ... new tEXt as created by us above 
    # ... IDAT onwards as created by OpenCV
    with open(filename, "wb") as out:
       out.write(buffer[0:IDAToffset])
       out.write(pHYs)
       out.write(tIME)
       out.write(tEXt)
       out.write(buffer[IDAToffset:])
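
All three chunks follow the same pattern (4-byte length, 4-byte type, data, then a CRC over type plus data), so the packing can be factored into one helper. This is a sketch rather than tested production code, and the names make_chunk and insert_before_idat are made up for illustration. It uses only the standard library and works on the bytes produced by imencode().

```python
import struct
import zlib


def make_chunk(chunk_type, data=b""):
    """Return a complete PNG chunk: length + type + data + CRC(type + data)."""
    body = chunk_type + data
    return struct.pack("!I", len(data)) + body + struct.pack("!I", zlib.crc32(body))


def insert_before_idat(png_bytes, *chunks):
    """Return png_bytes with the given chunks spliced in before the first IDAT."""
    offset = png_bytes.find(b"IDAT") - 4  # step back over IDAT's length field
    if offset < 0:
        raise ValueError("no IDAT chunk found")
    return png_bytes[:offset] + b"".join(chunks) + png_bytes[offset:]


# The chunks from this answer, rebuilt with the helper
pHYs = make_chunk(b"pHYs", struct.pack("!IIB", round(72 / 0.0254), round(72 / 0.0254), 1))
tIME = make_chunk(b"tIME", struct.pack("!HBBBBB", 2020, 12, 25, 12, 0, 0))
tEXt = make_chunk(b"tEXt", b"Author\x00Sir Mark The Great")
```

With s holding the encoded PNG bytes, the output file would then be written as out.write(insert_before_idat(s, pHYs, tIME, tEXt)). Deriving the length from the data avoids hard-coding 9 and 7.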
    

    Keywords: OpenCV, PIL, Pillow, dpi, density, imwrite, PNG, chunks, pHYs chunk, Python, image, image-processing, tEXt chunk, tIME chunk, author, comment