I'm trying to read the following image:
try:
    import Image
except ImportError:
    from PIL import Image
import pytesseract as tes

results = tes.image_to_string(Image.open('./test.png'), boxes=True)
print(results)
And here is the result I get:
_ 239 780 263 787 0
. 239 758 263 767 0
L 235 737 263 761 0
1 220 763 229 783 0
1 220 741 229 761 0
‘ 129 763 137 784 0
1 129 741 136 761 0
1 220 650 229 670 0
‘ 220 628 229 648 0
F 235 537 263 561 0
. 239 531 263 540 0
A 239 511 268 534 0
_ 199 554 223 561 0
I 260 401 268 421 0
r 235 424 263 448 0
. 239 418 263 427 0
_ 239 398 263 404 0
{ 220 424 229 444 0
I 220 401 229 421 0
“ 220 288 229 331 0
What does this mean? How can I interpret this result?
Thanks a lot!
Because you set boxes=True in tes.image_to_string(), the output is in box file format: the first character on each line is the recognized character, followed by the bounding box coordinates of that occurrence in the image (left, bottom, right, top, measured from the bottom-left corner of the image) and the page number. If boxes=False, tesseract outputs only the recognized characters.
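If you want to work with this output in code, a minimal sketch like the one below could split each line into the character and its coordinates. It assumes the usual box layout described above (character, left, bottom, right, top, page); Box and parse_boxes are hypothetical helper names used for illustration, not part of pytesseract.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """One line of box-format output (hypothetical helper type)."""
    char: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_boxes(text: str) -> List[Box]:
    """Parse box-format text into a list of Box tuples."""
    boxes = []
    for line in text.splitlines():
        # Split from the right so the character itself is kept intact.
        parts = line.strip().rsplit(" ", 5)
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        char, *nums = parts
        boxes.append(Box(char, *map(int, nums)))
    return boxes


# Two lines taken from the output in the question.
sample = "L 235 737 263 761 0\n1 220 763 229 783 0"
for box in parse_boxes(sample):
    # Print the character with the width and height of its box.
    print(box.char, box.right - box.left, box.top - box.bottom)
```

For the two sample lines this prints the character followed by the width and height of its box in pixels, which can help you spot implausible detections such as very thin or very flat boxes.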
The image you are trying to OCR appears to contain 7-segment digits, so you may need trained (language) data for 7-segment digits to get a good result.