pdf pdf-generation jpeg imagemagick-convert pdflatex

Storing jpg images into a pdf file in a "lossless" way

Given a directory with several jpg files (photos), I would like to create a single pdf file with one photo per page. However, I would like the photos to be stored in the pdf file unchanged; i.e., I would like to avoid decoding and re-encoding. So ideally I would like to be able to extract the original jpg files (maybe minus the metadata) from the pdf file, using, e.g., a Linux command-line tool like pdfimages.

My ideas so far:

imagemagick convert. However, I am confused by the compression options: if I choose 100% quality, does it mean that the jpg is internally decoded and then re-encoded losslessly (which is obviously not what I want)?
pdflatex. Some people claim that the graphics package includes images losslessly, while others dispute that. In any case, pdflatex would be slightly more cumbersome (I would first have to find out the dimensions of the photos, then set the page size accordingly, and make sure that there are no margins, headers, etc.).

Solution

You could use the following small script which relies on HexaPDF (note: I'm the author of HexaPDF) to do this.

Make sure you have Ruby 2.4 or later installed, then run gem install hexapdf to install HexaPDF.

Here is the script:

require 'hexapdf'

doc = HexaPDF::Document.new

ARGV.each do |image_file|
  image = doc.images.add(image_file)
  page = doc.pages.add
  # Image size in pixels and page size in PDF points
  iw = image.info.width.to_f
  ih = image.info.height.to_f
  pw = page.box(:media).width.to_f
  ph = page.box(:media).height.to_f
  # Scale by the smaller ratio so the whole image fits on the page
  rw, rh = pw / iw, ph / ih
  ratio = [rw, rh].min
  iw, ih = iw * ratio, ih * ratio
  # Center the scaled image on the page
  x, y = (pw - iw) / 2, (ph - ih) / 2
  page.canvas.image(image, at: [x, y], width: iw, height: ih)
end

doc.write('images.pdf')

Supply the image files as arguments on the command line; the output file will be named images.pdf. Most of the code deals with scaling and centering the images so that they fit nicely onto the pages.
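
To make the scaling and centering arithmetic easier to follow, here is the same calculation as a standalone sketch that needs no gems. The helper name fit_and_center and the 595x842 page size (roughly A4 in PDF points) are assumptions made for this illustration; neither is part of HexaPDF's API.

```ruby
# Scale an image to fit inside a page while preserving its aspect ratio,
# then center it. fit_and_center is a hypothetical helper used only for
# illustration; it mirrors the arithmetic in the script above.
def fit_and_center(iw, ih, pw, ph)
  # The smaller of the two ratios is the tighter constraint
  ratio = [pw.to_f / iw, ph.to_f / ih].min
  w = iw * ratio
  h = ih * ratio
  # Split the leftover space evenly on both sides
  x = (pw - w) / 2
  y = (ph - h) / 2
  [x, y, w, h]
end

# A 4000x3000 landscape photo on an assumed 595x842 page
x, y, w, h = fit_and_center(4000, 3000, 595, 842)
puts format('x=%.3f y=%.3f w=%.3f h=%.3f', x, y, w, h)
```

For a landscape photo on a portrait page the width is the limiting dimension, so the image spans the full page width and the remaining vertical space is divided equally above and below it.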