As far as I can tell, most OCR tools are designed for reading documents. I'm trying to write a program that reads the post-match results screen from a game, and I was wondering whether that is possible with some sort of workaround (I'm new to OCR tools).
An example of the image.
A simple program from the internet that I tried:
import cv2
import pytesseract
# raw string, so single backslashes are enough
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\labib\AppData\Local\Tesseract-OCR\tesseract.exe'
img = cv2.imread('test2.png')
text = pytesseract.image_to_string(img)
print(text)
I then tried different thresholds and grayscaling, but neither changed the result much.
I then thought about writing a function that would first crop the image down to the table and then read the values cell by cell, column by column. I don't know whether that would make things easier for the OCR tool.
Something like this. I would then put the data from the image into a spreadsheet (which I think I can do).
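The cropping idea could be sketched roughly as follows. This is only a minimal sketch, assuming the scoreboard rows are evenly spaced and the column boundaries are known in pixels; `cell_boxes` and every coordinate below are hypothetical, not measured from the screenshot. Each box could then be sliced out of the image (`img[top:bottom, left:right]`) and passed to `pytesseract.image_to_string` with `--psm 7`, which treats the input as a single line of text.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def cell_boxes(top: int, row_height: int, n_rows: int,
               col_edges: List[int]) -> List[List[Box]]:
    """Return one (left, top, right, bottom) box per table cell.

    Assumes evenly spaced rows. col_edges holds the x position of every
    column boundary, so N columns need N + 1 edges.
    """
    grid = []
    for r in range(n_rows):
        y0 = top + r * row_height
        y1 = y0 + row_height
        # pair each column edge with the next one to get the cell width
        row = [(x0, y0, x1, y1) for x0, x1 in zip(col_edges, col_edges[1:])]
        grid.append(row)
    return grid


# Hypothetical layout: 2 rows of 3 columns, starting 200 px from the top.
boxes = cell_boxes(top=200, row_height=40, n_rows=2,
                   col_edges=[100, 400, 500, 560])
for row in boxes:
    print(row)
```

Reading one cell at a time gives the OCR engine a single short string per call, at the cost of having to find the table position for each screenshot resolution.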
My question is: how should I approach reading text from an image that is not a document and is difficult to read? (My current issue is reading the text in the image itself.)
It is similar to this question: Preserving indentation with Tesseract OCR 4.x
You can try this code. It outputs most of what you want. To fit your needs exactly, you will still have to filter out the unnecessary icons, images, etc.
import cv2
import pytesseract
from pytesseract import Output
import pandas as pd
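# load the screenshot and convert it to grayscale
img = cv2.imread("sxFRauD.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# inverted binarization; with THRESH_OTSU the threshold is picked automatically and the 127 is ignored
thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# light blur to smooth the edges of the binarized glyphs
gauss = cv2.GaussianBlur(thresh, (3, 3), 0)
# --psm 6 assumes a single uniform block of text; preserve_interword_spaces=1 keeps the gaps between words
custom_config = r'-l eng --oem 3 --psm 6 -c preserve_interword_spaces=1 -c tessedit_char_whitelist="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-:. "'
# per-word results (text, confidence, bounding box) as a dict of lists
d = pytesseract.image_to_data(gauss, config=custom_config, output_type=Output.DICT)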
df = pd.DataFrame(d)
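# clean up blanks; conf may come back as str, int or float depending on the pytesseract version, so compare numerically
df1 = df[(pd.to_numeric(df.conf, errors='coerce') != -1) & (df.text.str.strip() != '')]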
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
# sort blocks vertically
sorted_blocks = df1.groupby('block_num').first().sort_values('top').index.tolist()
for block in sorted_blocks:
    curr = df1[df1['block_num'] == block]
    sel = curr[curr.text.str.len() > 3]
    # sel = curr
    # average character width in pixels, estimated from the longer words
    char_w = (sel.width / sel.text.str.len()).mean()
    prev_par, prev_line, prev_left = 0, 0, 0
    text = ''
    for ix, ln in curr.iterrows():
        # add new line when necessary
        if prev_par != ln['par_num']:
            text += '\n'
            prev_par = ln['par_num']
            prev_line = ln['line_num']
            prev_left = 0
        elif prev_line != ln['line_num']:
            text += '\n'
            prev_line = ln['line_num']
            prev_left = 0
        added = 0  # num of spaces that should be added
        if ln['left'] / char_w > prev_left + 1:
            added = int((ln['left']) / char_w) - prev_left
            text += ' ' * added
        text += ln['text'] + ' '
        prev_left += len(ln['text']) + added + 1
    text += '\n'
    print(text)
cv2.waitKey(0)
cv2.destroyAllWindows()
Output:
POST-ACTION REPORT
CUSTOM GAME 0:04 e
MATCH SCOREBOARD
DEFEAT Y Oo j8 kK al
MM Zz ezswims.ENMU a 2188 6 0 8 52
a Shiro.ENMU 8 2040 4 4 8 52
SH RubbaDucky.ENMU t 1721 3 1 9 44
ICEDrgon29.ENMU a 1710 3 0 8 69
3 Gongshow.ENMU a 1375 1 1 8 56
2 Nemesis.ERAU 4930 9 3 2 41
2 Phantom.ERAU 4895 9 3 4 50
7 2 Reggie.ERAU 4630 6 o 2 80
2 D4NG3RZON3.ERAU 4510 6 1 5 53
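To get from this output to spreadsheet rows, the stray tokens produced by icons can be dropped while parsing. The following is a minimal sketch, assuming that each player line contains a name of the form `name.TEAM` and ends with five numeric columns. The helper `parse_line` and the column names `stat_1` to `stat_5` are placeholders (the screenshot's column headers were not recognized), and treating a lone `o` as `0` is a guess about an OCR misread.

```python
import csv
import io
import re

NAME_RE = re.compile(r"^[A-Za-z0-9]+\.[A-Za-z]+$")  # e.g. Shiro.ENMU
N_STATS = 5  # assumed number of numeric columns per player


def parse_line(line):
    """Return [name, stat_1..stat_5] for a player line, else None."""
    tokens = line.split()
    names = [t for t in tokens if NAME_RE.match(t)]
    if not names or len(tokens) < N_STATS + 1:
        return None
    # Assumption: a lone "o"/"O" in a numeric column is a misread zero.
    stats = ["0" if t in ("o", "O") else t for t in tokens[-N_STATS:]]
    if not all(s.isdigit() for s in stats):
        return None
    return [names[0]] + [int(s) for s in stats]


ocr_output = """\
DEFEAT Y Oo j8 kK al
a Shiro.ENMU 8 2040 4 4 8 52
7 2 Reggie.ERAU 4630 6 o 2 80
"""

# keep only the lines that parse as player rows
rows = [r for r in map(parse_line, ocr_output.splitlines()) if r]

buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(["player", "stat_1", "stat_2", "stat_3", "stat_4", "stat_5"])
writer.writerows(rows)
print(buf.getvalue())
```

To write an actual file, the `io.StringIO` buffer can be replaced with `open("scoreboard.csv", "w", newline="")`; the file name is just an example.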