Tags: ocr, tesseract, python-tesseract

Python OCR with Tesseract: find a specific word in an image and return its coordinates


I have been trying for a few months to write code that finds a given word in an image and returns the coordinates of that word within the image. I tried OpenCV and Tesseract OCR, but without success. Could someone in the community help me?

I'll leave an image here as an example:

[Image: example input]


Solution

  • Here is something you can start with:

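    import pytesseract
    from PIL import Image

    # Only needed when the tesseract executable is not on your PATH.
    pytesseract.pytesseract.tesseract_cmd = r'C:\<path-to-your-tesseract>\Tesseract-OCR\tesseract.exe'

    img = Image.open("img.png")

    # image_to_data returns one entry per detected element (page, block,
    # paragraph, line, word); the non-word entries have empty text.
    data = pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT)

    for i, text in enumerate(data['text']):
        if text.strip():
            # Bounding box in pixels (left, top, width, height), then the word.
            print(data['left'][i], data['top'][i], data['width'][i], data['height'][i], text)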
    

    If you have trouble installing pytesseract, see: https://stackoverflow.com/a/53672281/18667225

    Output (columns: left, top, width, height, text):

    153 107 277 50 Palavras
    151 197 133 37 com
    309 186 154 48 R/RR
    154 303 126 47 Rato
    726 302 158 47 Resto
    154 377 144 50 Rodo
    720 379 159 47 Arroz
    152 457 160 48 Carro
    726 457 151 46 Ferro
    154 532 142 50 Rede
    726 534 159 47 Barro
    154 609 202 50 Parede
    726 611 186 47 Barata
    154 690 124 47 Faro
    726 685 288 50 Beterraba
    154 767 192 47 Escuro
    726 766 151 47 Ferro
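
The listing above prints every word. To answer the original question (the coordinates of one particular word), the same dictionary can be filtered. The following is a minimal sketch, not part of the original answer: `find_word` is a hypothetical helper name, and it assumes `data` has the `text`, `left`, `top`, `width` and `height` keys that `pytesseract.image_to_data` returns with dictionary output. The `sample` dictionary is a hand-built stand-in that reuses three rows from the output above, so the sketch runs without Tesseract installed.

```python
def find_word(data, target):
    """Return (left, top, width, height) for every OCR word equal to target.

    `data` is assumed to be the dict produced by pytesseract.image_to_data
    with dictionary output. The comparison ignores case and surrounding
    whitespace.
    """
    wanted = target.strip().casefold()
    if not wanted:
        # An empty target would otherwise match the empty non-word rows.
        return []
    matches = []
    for i, text in enumerate(data['text']):
        if text.strip().casefold() == wanted:
            matches.append((data['left'][i], data['top'][i],
                            data['width'][i], data['height'][i]))
    return matches


# Stand-in for the OCR result. The first row imitates a non-word entry
# (empty text) with placeholder zeros; the rest come from the output above.
sample = {
    'text':   ['', 'Carro', 'Ferro', 'Ferro'],
    'left':   [0, 152, 726, 726],
    'top':    [0, 457, 457, 766],
    'width':  [0, 160, 151, 151],
    'height': [0, 48, 46, 47],
}

print(find_word(sample, 'ferro'))
# prints [(726, 457, 151, 46), (726, 766, 151, 47)]
```

With a real image, pass the `data` dictionary from the listing above instead of `sample`. A word that occurs more than once, such as "Ferro" here, yields one box per occurrence. The opposite corner of each box is (left + width, top + height), which is what a rectangle-drawing call would typically need.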