Implementing a picture zoom effect using RMagick and FFmpeg

I have a picture and I need a zoom effect in the resulting video. I almost get the desired result, but the picture looks a bit shaky. That's because of rounding when cropping and resizing, so the centre of the picture shifts slightly with each conversion. What can I do about that? Or is there some other way to implement it? As input I have picture, zoom_type, zoom_percent, zoom_duration and scene_duration. Here is the part of the code that does the job:

img = Magick::ImageList.new(picture).first
width, height = img.columns.to_f, img.rows.to_f
img_fps = 30
if width >= height
  aspect_ratio = (width / height)
  zoom_small_size = ((height * (100 - zoom_percent)) / 100).to_f
  small_size = height
else
  aspect_ratio = (height / width)
  zoom_small_size = ((width * (100 - zoom_percent)) / 100).to_f
  small_size = width
end
factor = (((small_size - zoom_small_size) / (img_fps * zoom_duration))).to_f
while factor < 2
  img_fps -= 1
  factor = ((small_size - zoom_small_size) / (img_fps * zoom_duration))
end
total_images = img_fps * scene_duration
zoom_images = img_fps * zoom_duration
new_width =  width
new_height =  height
zoom_changed_small_size = small_size

total_images.times do |i|
  if zoom_images > 0 && zoom_changed_small_size > zoom_small_size
    img_n = img.crop(new_width, new_height, true)
    new_width = (width <= height) ? (new_width - factor).round : (new_width - factor * aspect_ratio).round
    new_height = (width >= height) ? (new_height - factor).round : (new_height - factor * aspect_ratio).round
    zoom_changed_small_size = (width >= height) ? img_n.rows : img_n.columns
    img_n.resize_to_fill!(width, height)
    img_n.write(format("img_%04d.jpg", i + 1))
    zoom_images -= 1
    img = img_n.copy if zoom_images == 0 || zoom_changed_small_size <= zoom_small_size
    img_n.destroy!
  else
    img.write(format("img_%04d.jpg", i + 1))
    puts "Writing - #{img.filename}"
  end
end

Then I run:

ffmpeg -y -f image2 -r 30 -i img_%04d.jpg -crf 0 -preset ultrafast -tune stillimage -pix_fmt yuv420p out.mp4

Solution

The simplest, brute-force approach you could take is to resize your starting images at the beginning of the process to be 3 or 4 times larger than they need to be at maximum zoom. That will reduce the effect of integer pixel rounding once you resize back down for the video frame, to the point where it is (hopefully) not easily visible or at least not distracting. The advantage of this approach is that you can keep most of your existing code as-is. The disadvantage is that you will need to experiment to get the right scaling factor, and depending on your source material and target video size you might end up working with some very large images.
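
To see why oversampling helps, here is a rough sketch that does not touch RMagick at all. It assumes the crop width is rounded to the nearest whole pixel on each frame and measures how far that rounding moves the crop edge in terms of original image pixels. The method name `worst_rounding_error` and the example numbers (640 px wide, 20% zoom, 90 frames) are made up for illustration.

```ruby
# Worst-case gap (in ORIGINAL image pixels) between the ideal crop width
# and the integer crop width actually used, over a whole linear zoom.
# upscale: how many times the source image was enlarged before cropping.
def worst_rounding_error(width, zoom_percent, frames, upscale)
  big_width = width * upscale
  final_width = big_width * (100 - zoom_percent) / 100.0
  (0...frames).map do |i|
    # Ideal (fractional) crop width for frame i in the enlarged image
    ideal = big_width - (big_width - final_width) * i / (frames - 1).to_f
    # Rounding error, converted back to original-image pixels
    (ideal.round - ideal).abs / upscale
  end.max
end

[1, 2, 4].each do |upscale|
  puts "x#{upscale}: #{worst_rounding_error(640, 20, 90, upscale).round(3)} px"
end
```

Rounding can be off by at most half a pixel in the enlarged image, so the error seen at the original scale is bounded by 0.5 divided by the upscale factor. The error never disappears, which is why this is only a patch.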

If that simple patch is not to your liking, then it is possible to work with sub-pixel sampling zooms and pans in ImageMagick:

I had a look at using affine_transform, but in practice it is fiddly. Instead, here is something using the distort method, which seems designed for your needs. This example takes an image, a set of points that define a "view rectangle" (which can all be floating point), and the target width and height of the zoomed-in image (which should be integers):

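# Renders the part of image inside the (possibly fractional) view rectangle
# to an output of to_width x to_height pixels.
def zoom_window image, from_left, from_top, from_right, from_bottom, to_width, to_height
  from_width = (from_right - from_left).to_f
  from_height = (from_bottom - from_top).to_f

  # Centre of the view rectangle; it is mapped to the centre of the output
  from_centre_x = 0.5 * ( from_left + from_right )
  from_centre_y = 0.5 * ( from_top + from_bottom )

  # Scale needed for the view rectangle to fill the output
  scale_x = to_width/from_width
  scale_y = to_height/from_height

  # SRT arguments: source centre, x and y scale, rotation angle, destination centre.
  # The viewport define fixes the output canvas to the target size.
  zoomed_and_scaled_image = image.distort( Magick::ScaleRotateTranslateDistortion,
    [ from_centre_x, from_centre_y, scale_x, scale_y, 0.0,
    0.5 * to_width, 0.5 * to_height]  ) { |i| i.define("distort:viewport", "#{to_width}x#{to_height}+0+0") }

  zoomed_and_scaled_image
end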

I have tested the output of this by progressively shrinking the from_ rectangle and then using a variation of your ffmpeg command. It resulted in a smooth zoom effect, coping nicely with sub-pixel accuracy even on extreme zoom-ins (although of course these look blurry). To use it, you would need to calculate the Float coordinates for the zoom window and call the above method (or your variation of it) where you currently crop and resize img_n.
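
As a rough sketch of that calculation, the window for each frame can be derived directly from the frame number instead of from the previous frame's rounded size, so no error builds up from frame to frame. The helper name `zoom_rect`, the linear progression and the example dimensions are my assumptions; zoom_percent has the same meaning as in your question.

```ruby
# Returns [left, top, right, bottom] as Floats for a centred view rectangle.
# progress runs from 0.0 (whole image visible) to 1.0 (fully zoomed in).
def zoom_rect(width, height, zoom_percent, progress)
  shrink = 1.0 - (zoom_percent / 100.0) * progress
  view_w = width * shrink
  view_h = height * shrink
  left = (width - view_w) / 2.0
  top = (height - view_h) / 2.0
  [left, top, left + view_w, top + view_h]
end

width, height = 640, 480   # example source size (assumed)
zoom_frames = 90           # e.g. 30 fps * 3 seconds of zoom (assumed)

rects = (0...zoom_frames).map do |i|
  zoom_rect(width, height, 20, i / (zoom_frames - 1).to_f)
end

# Each rectangle would then go to zoom_window together with the output size:
#   frame = zoom_window(img, *rect, width, height)
p rects.first
p rects.last
```

Because every rectangle is computed from the original dimensions and kept as Floats, its centre stays at the image centre on every frame, and the only rounding left is whatever distort does when it resamples.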

NB: I have not made much attempt to make this Ruby code "nice"; it's just a proof of concept.