I am currently using Spyder via Anaconda with Python 3.8.5 on Windows 10. When I run this code:
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
img_path = 'img/gotta-go-fast.jpg'
img = cv2.imread(img_path)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
result = pytesseract.image_to_string(img, lang='eng')
print(result)
The IPython console in Spyder is automatically cleared. The same does not happen if I only write the result to a text file, like this:
result = pytesseract.image_to_string(img, lang='eng')
with open('text_result.txt', mode='w') as file:
    file.write(result)
Is there any way to fix this?
Found what was happening. When pytesseract read the text, a \x0c (form feed)
character was added on a new line after the last line read.
Solved the issue by removing that character like this:
result = pytesseract.image_to_string(img, lang='eng')
arr = result.split('\n')[0:-1]
result = '\n'.join(arr)
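A slightly more direct variant, as a sketch: instead of dropping the whole last line, remove the form feed character itself, so nothing is lost if the OCR output does not end in a newline followed by \x0c. The helper name `strip_form_feed` and the sample string are hypothetical, made up for illustration.

```python
def strip_form_feed(text: str) -> str:
    # Remove every form feed character (code point 0x0C), then trim
    # whitespace left at the end, such as the final newline.
    return text.replace("\x0c", "").rstrip()


# Hypothetical OCR output that ends with a newline followed by a form feed.
sample = "GOTTA GO FAST\n\x0c"
cleaned = strip_form_feed(sample)
print(repr(cleaned))
```

In the script above this would be applied as `result = strip_form_feed(pytesseract.image_to_string(img, lang='eng'))` before printing.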