I'm building a Telegram bot, and one of its functions is recognizing text in an image. Everything works fine on Windows, but as soon as I switch to Linux I keep getting the same exception. At first I thought I was setting the path in pytesseract.pytesseract.tesseract_cmd incorrectly, since the sites I visited pointed to exactly that, but after carefully rechecking everything I did not find any error.
Here is my code:
from telebot import types
from googlesearch import search
from PIL import Image
import pytesseract
import cv2
import os
import numpy as np
import telebot
import config

bot = telebot.TeleBot(config.token)


@bot.message_handler(content_types=["photo"])
def answer_to_photo(message):
    # Only chat members are allowed to use the OCR feature
    statuss = ['creator', 'administrator', 'member']
    user_status = str(bot.get_chat_member(chat_id='chat id', user_id=message.from_user.id).status)
    if user_status in statuss:
        pytesseract.pytesseract.tesseract_cmd = r'/home/shalor1k/.local/bin/pytesseract'
        # Download the largest version of the photo and save it to disk
        file_info = bot.get_file(message.photo[len(message.photo) - 1].file_id)
        downloaded_file = bot.download_file(file_info.file_path)
        src = r'C:\bot\photo' + message.photo[1].file_id
        with open(src, 'wb') as new_file:
            new_file.write(downloaded_file)
        bot.reply_to(message, 'Processing your photo')
        image = src
        preprocess = "thresh"
        # Convert to grayscale, then apply the selected preprocessing step
        image = cv2.imread(image)
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        if preprocess == "thresh":
            gray = cv2.threshold(gray, 0, 255,
                                 cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
        elif preprocess == "blur":
            gray = cv2.medianBlur(gray, 3)
        # Write the preprocessed image to a temporary file and run OCR on it
        filename = "{}.png".format(os.getpid())
        cv2.imwrite(filename, gray)
        text = pytesseract.image_to_string(Image.open(filename), lang = 'rus')
        os.remove(filename)
        os.remove(src)
The text of the exception:
  File "main_bot_for_server.py", line 67, in answer_to_photo
    text = pytesseract.image_to_string(Image.open(filename), lang = 'rus')
  File "/home/shalor1k/.local/lib/python3.8/site-packages/pytesseract/pytesseract.py", line 370, in image_to_string
    return {
  File "/home/shalor1k/.local/lib/python3.8/site-packages/pytesseract/pytesseract.py", line 373, in <lambda>
    Output.STRING: lambda: run_and_get_output(*args),
  File "/home/shalor1k/.local/lib/python3.8/site-packages/pytesseract/pytesseract.py", line 282, in run_and_get_output
    run_tesseract(**kwargs)
  File "/home/shalor1k/.local/lib/python3.8/site-packages/pytesseract/pytesseract.py", line 258, in run_tesseract
    raise TesseractError(proc.returncode, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (2, 'Usage: pytesseract [-l lang] input_file')
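There turned out to be two problems. First, the binaries of Tesseract OCR itself were not installed. pytesseract is only a Python wrapper around the tesseract executable, and the usage message in the exception ('Usage: pytesseract [-l lang] input_file') comes from the pytesseract command-line script that tesseract_cmd pointed to, not from the OCR engine. Second, the required language packs (here, the one for lang = 'rus') were not installed.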
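
Both conditions can be checked before calling image_to_string. The following is only a rough sketch: the helper names parse_languages and check_tesseract are made up for illustration, and it assumes the tesseract executable, once installed, is on PATH and supports the --list-langs option. It uses only the standard library, so it runs even when Tesseract is missing.

```python
import shutil
import subprocess
from typing import List, Optional, Tuple


def parse_languages(listing: str) -> List[str]:
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    lines = [line.strip() for line in listing.splitlines() if line.strip()]
    # Drop the "List of available languages ..." header line if it is present.
    return [line for line in lines if not line.lower().startswith("list of")]


def check_tesseract(required_lang: str = "rus") -> Tuple[Optional[str], bool]:
    """Return (path to the tesseract binary or None, whether required_lang is installed)."""
    binary = shutil.which("tesseract")  # looks for the OCR engine, not the pytesseract script
    if binary is None:
        return None, False
    result = subprocess.run(
        [binary, "--list-langs"], capture_output=True, text=True, check=False
    )
    # Some Tesseract versions print the listing to stderr, so read both streams.
    languages = parse_languages(result.stdout + "\n" + result.stderr)
    return binary, required_lang in languages


if __name__ == "__main__":
    binary, has_lang = check_tesseract("rus")
    if binary is None:
        print("tesseract binary not found on PATH")
    elif not has_lang:
        print("tesseract found at", binary, "but the 'rus' language pack is missing")
    else:
        print("OK, tesseract_cmd can be set to", binary)
```

If the check passes, the returned path is the value to assign to pytesseract.pytesseract.tesseract_cmd. On Debian or Ubuntu (an assumption, the distribution is not stated above) the engine and the Russian language data are typically provided by the tesseract-ocr and tesseract-ocr-rus packages.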