python, ocr, tesseract, python-tesseract

Tesseract bounding box distance calculation



I am trying to use Tesseract OCR to extract text from an image. I first tried image_to_data, which returned the text but misread some characters in some cases. Then I tried image_to_boxes, which returned each character correctly. My problem is that I need to concatenate those characters into full words. Can anyone suggest how to do this? Specifically, I need to build three words from this dictionary: "~Phone", ":", and "+88-02-5042248-50".

Finally, can anyone explain what left, bottom, right, and top mean in this dictionary? Can I use them to find the distance between two characters?


Solution

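  • import cv2
    import pytesseract
    
    # Point pytesseract at the Tesseract binary (Windows install path; adjust as needed)
    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
    
    # Load image
    image = cv2.imread('muTYX.jpg')
    
    # --psm 7 treats the image as a single text line; --oem 3 selects the default OCR engine
    text = pytesseract.image_to_string(image, lang='eng', config='--psm 7 --oem 3')
    
    # Keep the first line and split it into words on spaces
    lines = text.split('\n')
    print(lines[0].split(' '))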
    

    Output: ['Phone', ':', '+88-02-55042248-50']

    image_to_boxes returns one line per character, in this format:

    # (character, left, bottom, right, top, page)
    # Coordinates are in pixels, with the origin at the bottom-left corner of the image.
    P 53 50 64 76 0 
    h 63 50 75 77 0
    ...
    

    More information is available at: https://nanonets.com/blog/ocr-with-tesseract/
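
If you would rather build the words from the image_to_boxes output yourself, one possible approach is to measure the horizontal gap between consecutive characters (the next character's left minus the previous character's right) and start a new word whenever that gap exceeds a threshold. The following is only a minimal sketch under some assumptions: the text is a single line, the characters arrive in left-to-right order, and `max_gap=5` is a hypothetical threshold you would need to tune for your image. `CharBox`, `parse_boxes`, and `group_into_words` are illustrative names, not part of pytesseract, and only the first two sample rows come from the output above; the rest are made up for the demonstration.

```python
from typing import List, NamedTuple


class CharBox(NamedTuple):
    """One character box: pixel coordinates with the origin at the bottom-left."""
    char: str
    left: int
    bottom: int
    right: int
    top: int


def parse_boxes(box_text: str) -> List[CharBox]:
    """Parse text in the 'char left bottom right top page' format."""
    boxes = []
    for line in box_text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        left, bottom, right, top = map(int, parts[1:5])
        boxes.append(CharBox(parts[0], left, bottom, right, top))
    return boxes


def group_into_words(boxes: List[CharBox], max_gap: int = 5) -> List[str]:
    """Join characters into words, splitting where the horizontal gap is large."""
    words: List[str] = []
    current = ""
    prev = None
    for box in boxes:
        if prev is not None:
            gap = box.left - prev.right  # horizontal distance between neighbours
            if gap > max_gap:
                words.append(current)
                current = ""
        current += box.char
        prev = box
    if current:
        words.append(current)
    return words


# The first two rows are from the output above; the others are hypothetical.
sample = """P 53 50 64 76 0
h 63 50 75 77 0
o 76 50 88 70 0
: 100 50 104 70 0
+ 118 55 130 68 0
8 131 50 142 76 0"""

words = group_into_words(parse_boxes(sample))
print(words)  # ['Pho', ':', '+8']
```

With a real image you would pass the string returned by `pytesseract.image_to_boxes(image)` to `parse_boxes` instead of `sample`. For multi-line text, the same idea would need an extra step that first groups characters into lines by their bottom/top values.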