pdf, imagemagick, imagemagick-convert, pdftk

How can I extract images from a PDF in Linux while preserving transparency?


I've tried using pdfextract to extract images from a PDF. It does extract the images I want, but it gives them a black background. It also extracts a "mask" image, which I believe is the alpha channel.


I've read through http://www.imagemagick.org/Usage/masking, but I see no example of applying an already-extracted mask to an existing image to restore transparency. Is there a way to do this using ImageMagick? If not, is there an easier way to extract images from a PDF while preserving transparency?


Solution

  • I just found the answer in this post:

    convert extracted-image.png extracted-image-mask.png -alpha off -compose copy-opacity -composite bug.png
    

    If anyone's interested, I made a little script to do all the steps at once: https://gist.github.com/bendavis78/ed22a974c2b4534305eabb2522956359
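
For anyone who wants to run the command above over a whole directory without the gist, here is a minimal sketch. It is not the linked script. It assumes each mask sits next to its image and is named `<name>-mask.png`, which is a naming convention invented for this example, so adjust `mask_suffix` to whatever your extraction tool actually writes. `mask_for` and `apply_masks` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: restore transparency for every image/mask pair in a directory.
# ASSUMPTION: masks are named "<name>-mask.png" next to "<name>.png".
mask_suffix="-mask"

# Print the mask path that would belong to the given image path.
mask_for() {
    printf '%s%s.png\n' "${1%.png}" "$mask_suffix"
}

# Combine each image in directory $1 with its mask; write results to directory $2.
apply_masks() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir"
    for img in "$src_dir"/*.png; do
        [ -e "$img" ] || continue                 # directory has no PNGs
        case $img in
            *"$mask_suffix".png) continue ;;      # skip the masks themselves
        esac
        mask=$(mask_for "$img")
        [ -e "$mask" ] || continue                # no mask found: leave image alone
        # Same ImageMagick invocation as in the answer, applied per pair.
        convert "$img" "$mask" -alpha off -compose copy-opacity -composite \
            "$out_dir/$(basename "$img")"
    done
}
```

After sourcing the file, something like `apply_masks ./extracted ./with-alpha` would write the recombined PNGs into `./with-alpha`. Images without a matching mask are skipped rather than copied, so the output directory only contains files whose transparency was actually restored.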