Tags: python, python-3.x, opencv, image-processing, mser

Extract text from an image using MSER in OpenCV (Python)


I want to detect text in an image using MSER and remove all non-text regions. Using the code below, I was able to detect the text:

import cv2

# Detect MSER regions on the grayscale image
mser = cv2.MSER_create()
img = cv2.imread('signboard.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
vis = img.copy()
regions, _ = mser.detectRegions(gray)

# Outline the convex hull of each region in green
hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions]
cv2.polylines(vis, hulls, 1, (0, 255, 0))
cv2.imshow('img', vis)
if cv2.waitKey(0) == 9:  # 9 is the Tab key
    cv2.destroyAllWindows()

How can I remove all non-text regions and get a binary image with only the text? I have searched extensively but could not find any example code that does this with Python and OpenCV.


Solution

  • You can get a binary image from the convex hulls you already found. Draw each hull, filled in white, onto an empty (black) single-channel image of the same size.

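    import numpy as np  # needed for the mask; not imported in the question's code
    mask = np.zeros(img.shape[:2], dtype=np.uint8)  # black, single-channel
    for contour in hulls:
        # thickness=-1 fills each hull; 255 is white in an 8-bit mask
        cv2.drawContours(mask, [contour], -1, 255, -1)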

    Note: see the official OpenCV documentation for more on drawContours.

    You can then use this mask to keep only the pixels inside the detected regions and black out everything else:

    text_only = cv2.bitwise_and(img, img, mask=mask)