python pdf imagemagick imagemagick-convert wand

imagemagick wand save pdf pages as images


I would like to use the ImageMagick Wand package to convert all pages of a PDF file into a single image file. I am running into the following trouble, though (the comments in the code highlight the problem):

import tempfile
from wand.image import Image


def save_using_filename(image):
    with tempfile.NamedTemporaryFile() as temp:
        # this saves all pages, but as a separate file for each page
        image.save(filename=temp.name)

def save_using_file(image):
    with tempfile.NamedTemporaryFile() as temp:
        # this only saves the first page as an image
        image.save(file=temp)


# open the PDF in binary mode and convert its pages to PNG on save
with open('my_pdf_with_5_pages.pdf', 'rb') as f:
    image = Image(file=f)
    image.format = 'png'
    save_using_filename(image)
    save_using_file(image)

My end goal is to be able to specify which pages are converted into one continuous image. This is possible from the command line with something like

convert input.pdf[0-4] -append output.png

but I am trying to do this in Python.

I see we can get slices by doing this:

[x for x in w.sequence[0:2]]  # get pages 1 and 2

Now it's a question of how to join these pages together.


Solution

  • A slight simplification of @rikAtee's answer, with the addition of detecting the page count automatically from the length of the sequence:

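    from wand.image import Image

    def convert_pdf_to_png(blob):
        pdf = Image(blob=blob)

        # one entry in pdf.sequence per PDF page
        pages = len(pdf.sequence)

        # blank canvas tall enough for all pages stacked vertically
        # (assumes every page has the same size as the first)
        image = Image(
            width=pdf.width,
            height=pdf.height * pages
        )

        for i in range(pages):
            image.composite(
                pdf.sequence[i],
                top=pdf.height * i,
                left=0
            )

        return image.make_blob('png')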
    

    I haven't noticed any memory leak issues, although my PDFs tend to be only 2 or 3 pages.
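
    Since the question's end goal was to choose which pages go into the image, here is a rough sketch of how the function above might be extended. It is not part of the original answer: the names `stack_layout`, `pdf_pages_to_png`, `page_indices`, and the default `resolution` of 150 are made up for illustration. It assumes pages may differ in size (the canvas takes the widest page's width) and uses `with` blocks so the ImageMagick resources are released promptly, which may matter for longer PDFs.

```python
def stack_layout(sizes):
    """Given (width, height) pairs, return (canvas_width, canvas_height, tops)
    for stacking the pages top to bottom, left-aligned."""
    canvas_width = max(w for w, _ in sizes)
    tops = []
    y = 0
    for _, h in sizes:
        tops.append(y)  # this page starts where the previous one ended
        y += h
    return canvas_width, y, tops


def pdf_pages_to_png(blob, page_indices=None, resolution=150):
    """Render the selected pages (0-based) of a PDF blob into one tall PNG blob.

    Hypothetical helper: page_indices=None means "all pages".
    """
    # Imported here so stack_layout can be used without ImageMagick installed.
    from wand.color import Color
    from wand.image import Image

    with Image(blob=blob, resolution=resolution) as pdf:
        if page_indices is None:
            page_indices = range(len(pdf.sequence))
        pages = [pdf.sequence[i] for i in page_indices]

        width, height, tops = stack_layout(
            [(page.width, page.height) for page in pages]
        )

        # White background so narrower pages do not leave transparent gaps.
        with Image(width=width, height=height,
                   background=Color('white')) as canvas:
            for page, top in zip(pages, tops):
                canvas.composite(page, left=0, top=top)
            return canvas.make_blob('png')
```

    Something like `pdf_pages_to_png(blob, page_indices=[0, 1, 4])` would then stack pages 1, 2 and 5. An empty page selection is not handled here and would raise a `ValueError` from `max`.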