Skew correction is a common step in OCR, and my skew correction code, which uses HoughLinesP() to find the skew angle of the image, works fine. What I want to ask is: how do you deal with an image that is rotated by 90 degrees? The detected angle would probably be 0 degrees, and Tesseract won't extract any text.
How can I deal with this problem?
You can use Tesseract's OSD (orientation and script detection) function to correct the orientation. Skew correction with traditional image processing may work for clean images, but it can fail on scanned and low-quality images, so it is better to rely on the OSD functionality that Tesseract provides.
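To make the OSD step more concrete, here is a minimal sketch of reading the plain-text report that OSD produces. The sample report below is illustrative (its numbers are made up), and `parse_osd`, `rotation_to_apply`, and the `min_confidence` threshold of 2.0 are hypothetical names and values chosen for this example, not part of pytesseract.

```python
import re

# Illustrative OSD report; the numbers are made up for this example.
SAMPLE_OSD = """Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 21.27
Script: Latin
Script confidence: 4.14
"""


def parse_osd(report):
    """Return (rotate_degrees, orientation_confidence) from an OSD report."""
    rotate = re.search(r"Rotate:\s*(\d+)", report)
    conf = re.search(r"Orientation confidence:\s*([\d.]+)", report)
    if rotate is None or conf is None:
        raise ValueError("unexpected OSD report format")
    return int(rotate.group(1)), float(conf.group(1))


def rotation_to_apply(report, min_confidence=2.0):
    """Hypothetical helper: ignore the suggested rotation when confidence is low.

    The default threshold is an arbitrary assumption; tune it on your own data.
    """
    rotate, confidence = parse_osd(report)
    return rotate if confidence >= min_confidence else 0


print(rotation_to_apply(SAMPLE_OSD))  # prints 90
```

Checking the confidence before rotating is optional, but it may help on pages with very little text, where the orientation estimate tends to be less reliable.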
import re

import pytesseract
import cv2
import numpy as np
# Function to rotate an image without cropping its corners
def rotate_bound(image, angle):
    """Rotate image by the given angle.

    :param numpy.ndarray image: input image
    :param float angle: angle to rotate by, in degrees
    :return: rotated image
    :rtype: numpy.ndarray
    """
    (h, w) = image.shape[:2]
    # Center of the image
    (cX, cY) = (w // 2, h // 2)
    # Build the rotation matrix
    M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
    cos = np.abs(M[0, 0])
    sin = np.abs(M[0, 1])
    # New bounding dimensions so the whole rotated image fits
    nW = int((h * sin) + (w * cos))
    nH = int((h * cos) + (w * sin))
    # Shift the rotation center to the center of the new canvas
    M[0, 2] += (nW / 2) - cX
    M[1, 2] += (nH / 2) - cY
    return cv2.warpAffine(image, M, (nW, nH))
# Read the input image
image = cv2.imread('path/to/image.jpg')
# Get orientation info from Tesseract OSD
newdata = pytesseract.image_to_osd(image)
# Extract the rotation angle from the OSD report
angle = re.search(r'(?<=Rotate: )\d+', newdata).group(0)
print('osd angle:', angle)
# Rotate the image by the detected angle
skew_corrected_image = rotate_bound(image, float(angle))
# You can add the Tesseract OCR call here
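Since the OSD `Rotate` value is a multiple of 90 degrees, the rotation can also be done without interpolation. The following is a minimal sketch using `np.rot90` as an alternative to `rotate_bound`; the helper name `rotate_right_angle` is hypothetical, and it assumes the same convention as `rotate_bound` above, where a positive angle means a clockwise rotation.

```python
import numpy as np


def rotate_right_angle(image, angle):
    """Rotate clockwise by a multiple of 90 degrees without interpolation."""
    angle = int(angle) % 360
    if angle % 90 != 0:
        raise ValueError("angle must be a multiple of 90 degrees")
    # np.rot90 turns counter-clockwise for positive k, so negate k for clockwise.
    rotated = np.rot90(image, k=-(angle // 90))
    # np.rot90 returns a view; copy into a contiguous array for downstream libraries.
    return np.ascontiguousarray(rotated)


# Tiny 2x3 array standing in for an image
demo = np.array([[1, 2, 3],
                 [4, 5, 6]])
print(rotate_right_angle(demo, 90).tolist())  # prints [[4, 1], [5, 2], [6, 3]]
```

Because no pixels are resampled, this keeps the image content exact, which may matter for small or low-resolution text. For arbitrary skew angles, `rotate_bound` is still the appropriate tool.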