r, tesseract, magick

Reading a text portion from a list of images and saving the images in R, using magick


Here is the code I'm using to read text from multiple images in my list. After resizing each image, I also want to save the output into separate folders under the exact same name.

    library(magick)
    library(magrittr)

    # `list` holds the image file paths
    test <- image_read(list) %>%
      image_crop("100x16+161+68") %>%
      image_resize("2000") %>%
      image_convert() %>%
      image_trim() %>%
      image_ocr()
    cat(test)

As far as I can tell, I could use "image_write" to save each image under a unique name, as in the attempt below. I would be thankful for any suggestions and help, and I hope the answer will be useful to new users as well. If possible, I need to process a batch of about 100 images for a large data set.

    image_write(list, path = "/data/backup", format = "png")

Solution

  • This is one way:

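    library(magick)
    library(purrr)
    
    # Write the image to output_dir under its original file name, then
    # return the image unchanged so the pipe can keep going.
    save_image <- function(img, img_name, output_dir) {
      image_write(img, file.path(output_dir, basename(img_name)))
      img
    }
    
    fils <- list.files("/tmp/so", pattern = "png$", full.names = TRUE)
    
    # ocr_result is a list with one OCR result per input file.
    ocr_result <- map(fils, ~{
      curr_fil <- .x
      image_read(curr_fil) %>%
        image_crop("100x16+161+68") %>%
        image_resize("2000") %>%
        save_image(curr_fil, "/tmp/backup") %>%  # side effect: save the resized crop
        image_convert() %>%
        image_trim() %>%
        image_ocr()
    })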
    

    There's no real need to make a function, but it makes the pipes cleaner. Because save_image() returns the image after writing it, you can have a pipe element with a side effect and still keep processing.
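
If "separate folders with exact name" means one folder per image, the output path could be built along the lines below. This is only a sketch under that assumption: output_path_for() is a hypothetical helper (not part of magick), the file names are made up, and the output root is a temporary directory chosen for illustration. It uses base R only and does not read any images.

```r
# Hypothetical helper: build <output_root>/<stem>/<file name> and make sure
# the per-image folder exists before anything is written to it.
output_path_for <- function(img_name, output_root) {
  stem <- tools::file_path_sans_ext(basename(img_name))
  out_dir <- file.path(output_root, stem)
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  file.path(out_dir, basename(img_name))
}

# Made-up input paths, used only to show the resulting layout.
fils <- c("/tmp/so/scan_001.png", "/tmp/so/scan_002.png")
out_root <- file.path(tempdir(), "backup")

# One output path per input file; each parent folder is created on the way.
out_paths <- vapply(fils, output_path_for, character(1), output_root = out_root)
print(basename(dirname(out_paths)))
```

Inside save_image(), the call would then become image_write(img, output_path_for(img_name, output_dir)), so each resized crop lands in its own folder while keeping its original file name.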