Tags: python, numpy, pytesser, python-mss

Pytesseract: trying to detect text on screen


I'm using MSS together with pytesseract to read a string of characters from a monitored region of the screen. My code is as follows:

import Image
import pytesseract
import cv2
import os
import mss
import numpy as np

with mss.mss() as sct:
    mon = {'top': 0, 'left': 0, 'width': 150, 'height': 150}

    im = sct.grab(mon)
    im = np.asarray(im)
    im_gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
    #im_gray = plt.imshow(im_gray, interpolation='nearest')

    cv2.imwrite("test.png", im_gray)
    #cur_dir = os.getcwd()
    text = pytesseract.image_to_string(Image.open(im_gray))
    print(text)

    cv2.imshow("Image", im)
    cv2.imshow("Output", im_gray)
    cv2.waitKey(0)

This raises the following error: AttributeError: 'numpy.ndarray' object has no attribute 'read'

I have also tried converting it back to an image using pyplot, as indicated by the commented line in the code sample. However, that raises the error: TypeError: img is not a numpy array, neither a scalar
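For context, the AttributeError arises because `Image.open` expects a file path or a file-like object with a `read` method, whereas `im_gray` is already a NumPy array. Below is a minimal sketch of an in-memory conversion, assuming Pillow and NumPy are installed. The helper name `array_to_image` is hypothetical, and the synthetic array stands in for the grayscale screenshot:

```python
import numpy as np
from PIL import Image


def array_to_image(arr: np.ndarray) -> Image.Image:
    """Wrap a NumPy array in a PIL image without touching the disk."""
    # Image.fromarray expects an integer dtype such as uint8 for 8-bit images.
    return Image.fromarray(np.ascontiguousarray(arr, dtype=np.uint8))


# Stand-in for the 150x150 grayscale capture (im_gray in the question).
fake_gray = np.zeros((150, 150), dtype=np.uint8)
img = array_to_image(fake_gray)
print(img.mode, img.size)
```

The resulting image could then be passed to `pytesseract.image_to_string(img)` in place of `Image.open(im_gray)`.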

I'm fairly new to Python (I just started dabbling with it on Sunday), although my other attempts at detecting images have been rather successful. To reach my end goal, though, I'll need to read characters on screen. They will consistently be the same font and size, so I don't have to worry about scaling issues. For the time being, I'm trying to understand how it works by capturing the Recycle Bin icon on the desktop into memory (without saving to a file) and extracting the string "Recycle Bin" from the image.

UPDATE: I think I may have made a breakthrough, but there are issues if I try to display the stream at the same time. However, I may be able to process the stream fast enough by using temporary files.

My updated code is as follows:

from PIL import Image
from PIL import ImageGrab
import pytesseract
import cv2
import os
import mss
import numpy as np
from matplotlib import pyplot as plt
import tempfile

png = tempfile.NamedTemporaryFile(mode="wb")

with mss.mss() as sct:
    #while True:
    mon = {'top': 0, 'left': 0, 'width': 150, 'height': 150}
    im = sct.grab(mon)
    im_array = np.asarray(im)
    #im_gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
    #with tempfile.NamedTemporaryFile(mode="wb") as png:
    png.write(im_array)
    im_name = png.name
    print(png.name)

    #cv2.imwrite("test.png", im_gray)
    #cur_dir = os.getcwd()
    #text = pytesseract.image_to_string(Image.open(im_name))
    #print(text)
    cv2.imshow("Image", im_array)
#cv2.imshow("Output", im_gray)
cv2.waitKey(0)

This currently raises a "permission denied" error, which is as follows:

File "C:\Python\Python36-32\Lib\idlelib\ocr.py", line 27, in <module>
    text = pytesseract.image_to_string(Image.open(im_name))
  File "C:\Python\Python36-32\lib\site-packages\PIL\Image.py", line 2543, in open
    fp = builtins.open(filename, "rb")
PermissionError: [Errno 13] Permission denied: 'C:\\Users\\JMCOLL~1\\AppData\\Local\\Temp\\tmp7_mwy2k9'

I am skeptical that this is normal, and I will be trying this update on my laptop at home. It could be due to restrictions on the work laptop, which I just don't have time to work around.
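One likely cause, independent of any workplace restrictions: on Windows, a file created by `tempfile.NamedTemporaryFile` generally cannot be opened a second time by name while the original handle is still open. Note also that `png.write(im_array)` writes the raw pixel buffer rather than PNG-encoded data. Below is a hedged sketch of the usual workaround, using only the standard library. The helper `write_temp_file` is hypothetical, and the placeholder bytes stand in for real PNG data (which `mss.tools.to_png(im.rgb, im.size)` should return when no output path is given):

```python
import os
import tempfile


def write_temp_file(data: bytes, suffix: str = ".png") -> str:
    """Write data to a named temp file, close it, and return its path."""
    # delete=False keeps the file on disk after the handle is closed,
    # so it can be reopened by name (required on Windows).
    with tempfile.NamedTemporaryFile(mode="wb", suffix=suffix, delete=False) as tmp:
        tmp.write(data)
        return tmp.name


payload = b"placeholder bytes, not a real PNG"
path = write_temp_file(payload)
try:
    # The handle is closed now, so a second open by name succeeds.
    with open(path, "rb") as fh:
        round_trip = fh.read()
finally:
    os.remove(path)  # with delete=False, cleanup is the caller's job
```

With a real PNG on disk, `Image.open(path)` would be called where the sketch reopens the file, before the cleanup step.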

I am also rather confused as to why displaying the image without the while True: loop works fine, as a single screenshot, whereas putting it in a while True: loop causes the window to freeze.


Solution

  • I can get this code working:

    import time
    
    import cv2
    import mss
    import numpy
    import pytesseract
    
    
    mon = {'top': 0, 'left': 0, 'width': 150, 'height': 150}
    
    with mss.mss() as sct:
        while True:
            im = numpy.asarray(sct.grab(mon))
            # im = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
    
            text = pytesseract.image_to_string(im)
            print(text)
    
            cv2.imshow('Image', im)
    
            # Press "q" to quit
            if cv2.waitKey(25) & 0xFF == ord('q'):
                cv2.destroyAllWindows()
                break
    
            # One screenshot per second
            time.sleep(1)
    

    The sleep is probably a good idea so that the loop does not max out your CPU.
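One detail worth keeping in mind: `mss` captures are in BGRA order, so the array produced by `numpy.asarray(sct.grab(mon))` has four channels with blue first. If the channel order matters for later processing, it can be rearranged with plain NumPy slicing. This is only a sketch; the helper name `bgra_to_rgb` is hypothetical and the one-pixel array stands in for a real capture:

```python
import numpy as np


def bgra_to_rgb(frame: np.ndarray) -> np.ndarray:
    """Drop the alpha channel and reverse B, G, R into R, G, B."""
    # Index 2::-1 on the last axis selects channels 2, 1, 0 (R, G, B).
    return np.ascontiguousarray(frame[..., 2::-1])


# One-pixel stand-in for a capture: B=10, G=20, R=30, A=255.
fake_frame = np.array([[[10, 20, 30, 255]]], dtype=np.uint8)
rgb = bgra_to_rgb(fake_frame)
print(rgb[0, 0].tolist())
```

For OCR on plain text the channel order often makes little difference, so this step is optional.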