So I'm trying to write a program that reads numbers from a graph (and then does stuff with them, but that's irrelevant). I've got the following code, which mostly works.
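import cv2
import numpy as np
import pytesseract

img_resource_path = resource_path(img_path)
# With "import pytesseract", tesseract_cmd has to be set on the inner module
pytesseract.pytesseract.tesseract_cmd = PATH_TO_TESSERACT
img = cv2.imread(img_resource_path, 0)  # 0 = load as grayscale
kernel = np.ones((1, 1), np.uint8)  # note: a 1x1 kernel leaves the image unchanged
img = cv2.dilate(img, kernel, iterations=1)
img = cv2.erode(img, kernel, iterations=1)
# Otsu threshold; inverting the inverted result keeps the original polarity
thresh = 255 - cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
x, y, w, h = 0, 615, 680, 200  # region of interest
ROI = thresh[y:y+h, x:x+w]
text = pytesseract.image_to_string(ROI, lang='eng')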
The issue is that there is a black line on top of some of the lines of the table, which makes Tesseract read the characters incorrectly. It should output -90.58dB but it outputs -950.58dB.
How do I make it so it ignores the black line on top? (Picture attached with what it looks like)
Edit: Can I create my own training data and use that? A few online sources, including the Tesseract docs, said that retraining likely will not help. Any opinions?
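On the black line itself: one option that keeps Tesseract is to blank out any pixel row that is almost entirely dark before running OCR. The sketch below is only an illustration of that idea, not a tested fix for this image. The function name `whiten_ruling_lines` and the 0.8 cutoff are assumptions made up for the example, and it assumes a black-on-white binarized image like `ROI` above.

```python
import numpy as np

def whiten_ruling_lines(binary, min_dark_fraction=0.8):
    """Return a copy of a black-on-white image with near-solid dark rows set to white.

    min_dark_fraction is an assumed cutoff; tune it for the actual table.
    """
    cleaned = binary.copy()
    # Fraction of dark pixels in each row (values below 128 count as dark)
    dark_fraction = (binary < 128).mean(axis=1)
    # Rows that are dark almost all the way across are treated as ruling lines
    line_rows = dark_fraction >= min_dark_fraction
    cleaned[line_rows, :] = 255
    return cleaned

# Synthetic demo: one full-width line plus a small glyph-like blob
demo = np.full((6, 10), 255, dtype=np.uint8)
demo[0, :] = 0        # ruling line across the whole width
demo[2:5, 3:5] = 0    # stands in for a character
cleaned = whiten_ruling_lines(demo)
```

In the question's code this would be applied as `ROI = whiten_ruling_lines(ROI)` just before `pytesseract.image_to_string`. Note that it also erases any character pixels that sit on the same rows as the line, so it only helps when the line merely touches the top of the text.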
Just try EasyOCR. Note that it does not use the Tesseract engine; it runs its own deep-learning text detection and recognition models on top of PyTorch. Install it with pip install easyocr
import easyocr

# English model, CPU only
reader = easyocr.Reader(['en'], gpu=False)
# Each result is a (bounding box, text, confidence) tuple
result = reader.readtext('3.png')
for detection in result:
    print(detection)
The output is:
([[8, 8], [458, 8], [458, 88], [8, 88]], '-90.58 dB', 0.7896991366270714)
([[7, 84], [460, 84], [460, 168], [7, 168]], '-90.58 dB', 0.7465829283820857)
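If you need the readings as numbers rather than strings, the tuples above can be post-processed with a regular expression. This is a minimal sketch: the helper `parse_db_readings`, the pattern, and the 0.5 confidence cutoff are assumptions for illustration and are not part of EasyOCR. The sample data is copied from the output above, so the snippet runs without EasyOCR installed.

```python
import re

# Matches an optionally negative decimal number followed by "dB"
DB_PATTERN = re.compile(r"(-?\d+(?:\.\d+)?)\s*dB", re.IGNORECASE)

def parse_db_readings(detections, min_confidence=0.5):
    """Extract dB values as floats from (box, text, confidence) tuples."""
    readings = []
    for _box, text, confidence in detections:
        match = DB_PATTERN.search(text)
        # Skip detections with no dB value or below the assumed confidence cutoff
        if match and confidence >= min_confidence:
            readings.append(float(match.group(1)))
    return readings

# Detections copied from the output shown above
sample = [
    ([[8, 8], [458, 8], [458, 88], [8, 88]], '-90.58 dB', 0.7896991366270714),
    ([[7, 84], [460, 84], [460, 168], [7, 168]], '-90.58 dB', 0.7465829283820857),
]
readings = parse_db_readings(sample)
print(readings)  # [-90.58, -90.58]
```

With a real image you would pass `result` from `reader.readtext(...)` in place of `sample`.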