Search code examples
rocrtesseractbounding-box

Why are the Bounding Boxes from Tesseract not aligned on the image text?


I'm using the tesseract R package to recognize text within an image file. However, when plotting the bounding box for a word, the coordinates don't seem to be right.

  1. Why is the bounding box for the word "This" not aligned with the text "This" in the image? Output
  2. Is there an easier way to plot all bounding box rectangles on the image?
library(tesseract)
library(magick)
library(tidyverse)

text <- tesseract::ocr_data("http://jeroen.github.io/images/testocr.png")
image <- image_read("http://jeroen.github.io/images/testocr.png")

text <- text %>% 
  separate(bbox, c("x1", "y1", "x2", "y2"), ",") %>% 
  mutate(
    x1 = as.numeric(x1),
    y1 = as.numeric(y1),
    x2 = as.numeric(x2),
    y2 = as.numeric(y2)
  )

plot(image)
rect(
  xleft = text$x1[1], 
  ybottom = text$y1[1], 
  xright = text$x2[1], 
  ytop = text$y2[1])

Solution

  • This is simply because the x, y co-ordinates of images are counted from the top left, whereas rect counts from the bottom left. The image is 480 pixels tall, so we can do:

    plot(image)
    rect(
      xleft = text$x1[1], 
      ybottom = 480 - text$y1[1], 
      xright = text$x2[1], 
      ytop = 480 - text$y2[1])
    

    enter image description here

    Or, to show this generalizes:

    plot(image)
    
    rect(
      xleft = text$x1, 
      ybottom = magick::image_info(image)$height - text$y1, 
      xright = text$x2, 
      ytop = magick::image_info(image)$height - text$y2,
      border = sample(128, nrow(text)))
    

    enter image description here