I am working on full-page offline handwriting recognition with deep learning.
The main idea is to build a model that takes an image of one line of text and outputs the corresponding text. The main preprocessing task is therefore to segment every line in a page and send each one to the model.
I am using the code below, which is a slight modification of the code seen here. The main problem is that it crops the lines of the image in a seemingly random order, and I save them serially as segment_no_1, 2, 3, ...
When I pass these randomly ordered line segments to the model, it cannot produce the corresponding digital text in the right order.
Is there a suitable method or algorithm to perform line segmentation with OpenCV in the same order as in the original image? I have already found line segmentation methods based on deep learning, but I do not want to use them.
My code:
import cv2
import numpy as np
#import image
image = cv2.imread('input2.png')
#cv2.imshow('orig',image)
#cv2.waitKey(0)
#grayscale
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
cv2.imshow('gray',gray)
cv2.waitKey(0)
#binary
ret,thresh = cv2.threshold(gray,127,255,cv2.THRESH_BINARY_INV)
cv2.imshow('second',thresh)
cv2.waitKey(0)
#dilation
kernel = np.ones((5,100), np.uint8)
img_dilation = cv2.dilate(thresh, kernel, iterations=1)
cv2.imshow('dilated',img_dilation)
cv2.waitKey(0)
#find contours
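#[-2:] keeps (contours, hierarchy) whether findContours returns 3 values (OpenCV 3.x) or 2 (OpenCV 2.x and 4.x)
ctrs, hier = cv2.findContours(img_dilation.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]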
#sort contours
sorted_ctrs = sorted(ctrs, key=lambda ctr: cv2.boundingRect(ctr)[0])
for i, ctr in enumerate(sorted_ctrs):
    # Get bounding box
    x, y, w, h = cv2.boundingRect(ctr)
    # Getting ROI
    roi = image[y:y+h, x:x+w]
    # show ROI
    cv2.imshow('segment no:'+str(i),roi)
    cv2.imwrite("segment_no_"+str(i)+".png",roi)
    cv2.rectangle(image,(x,y),( x + w, y + h ),(90,0,255),2)
    cv2.waitKey(0)
cv2.imwrite('final_bounded_box_image.png',image)
cv2.imshow('marked areas',image)
cv2.waitKey(0)
The segment_no_1.png file, which should be the first line segment, can come from the middle of the page, or sometimes from the second-to-last line, and so on.
So, what modification do I need to get the segmented lines in the correct order (serially), as in the original image?
Any improvement to my code is also highly appreciated. Thanks in advance.
I think you should follow this article, which shows how to sort contours: Sorting Contours using Python and OpenCV.
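As a rough illustration of the idea: cv2.boundingRect returns (x, y, w, h), so sorting on index 0 orders the boxes left to right, while text lines need to be ordered by y (index 1). The sketch below is only an assumption about how this could be done, not the linked article's code. The function name reading_order and the sample boxes are made up for the example, and plain tuples are used in place of real contours so that it runs without an image.

```python
def reading_order(boxes):
    """Return the indices of `boxes` sorted top-to-bottom, then left-to-right.

    Each box is an (x, y, w, h) tuple, the layout cv2.boundingRect returns.
    """
    return sorted(range(len(boxes)), key=lambda i: (boxes[i][1], boxes[i][0]))


# Hypothetical bounding boxes, in an arbitrary order such as findContours might give.
boxes = [(12, 210, 600, 40), (10, 20, 610, 38), (11, 115, 590, 42)]

order = reading_order(boxes)
sorted_boxes = [boxes[i] for i in order]

# Number the segments in page order, top line first.
for line_no, (x, y, w, h) in enumerate(sorted_boxes):
    print(f"segment_no_{line_no}: x={x}, y={y}, w={w}, h={h}")
```

In the question's code, the equivalent change would be to use key=lambda ctr: cv2.boundingRect(ctr)[1] in the sorted call. This assumes the dilation step merges each text line into a single contour; if a line is split into several contours, they would also need to be grouped by row.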
The basic steps that I follow are: