
How to handle multi-page images in PythonMagick?


I want to convert some multi-page .tif or .pdf files to individual .png images. From the command line (using ImageMagick) I just do:

convert multi_page.pdf file_out.png

And I get all the pages as individual images (file_out-0.png, file_out-1.png, ...)

I would like to handle this file conversion within Python. Unfortunately, PIL cannot read .pdf files, so I want to use PythonMagick. I tried:

import PythonMagick
im = PythonMagick.Image('multi_page.pdf')
im.write("file_out%d.png")

or just

im.write("file_out.png")

But I only get one page converted to png. Of course I could load each page individually and convert them one by one, but there must be a way to do them all at once?


Solution

  • ImageMagick is not memory efficient: if you try to read a large PDF (100 pages or so) all at once, the memory requirement will be huge and it might crash or seriously slow down your system. So reading all pages at once with PythonMagick is a bad idea; it's not safe. For PDFs I ended up converting page by page, which means getting the number of pages first. pyPdf does that reasonably fast:

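    import pyPdf
    import PythonMagick

    # Count the pages first, then convert one page at a time to limit memory use.
    pdf_im = pyPdf.PdfFileReader(open('multi_page.pdf', 'rb'))
    npage = pdf_im.getNumPages()
    for p in range(npage):
        # 'multi_page.pdf[N]' selects page N (zero-based) of the file
        im = PythonMagick.Image('multi_page.pdf[' + str(p) + ']')
        im.write('file_out-' + str(p) + '.png')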
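
The page-by-page loop above can also be split into two small pieces so the naming logic can be checked without ImageMagick installed. This is only a sketch: `page_jobs` and `convert_pages` are hypothetical helper names, not part of PythonMagick or pyPdf, and the `convert_one` callback is assumed to do the actual conversion of one page.

```python
def page_jobs(src, n_pages, out_prefix="file_out", ext="png"):
    """Return (input_spec, output_name) pairs, one per page.

    input_spec uses the 'name[index]' form with a zero-based page index,
    the same form the loop above passes to PythonMagick.Image.
    """
    return [
        ("%s[%d]" % (src, p), "%s-%d.%s" % (out_prefix, p, ext))
        for p in range(n_pages)
    ]


def convert_pages(src, n_pages, convert_one, out_prefix="file_out"):
    """Convert one page at a time by calling convert_one(spec, out_name).

    convert_one is supplied by the caller (hypothetical hook), so only one
    page needs to be held in memory at any moment.
    """
    jobs = page_jobs(src, n_pages, out_prefix)
    for spec, out_name in jobs:
        convert_one(spec, out_name)
    return [out_name for _, out_name in jobs]
```

With PythonMagick, the callback could be `lambda spec, out: PythonMagick.Image(spec).write(out)`, and `n_pages` would come from `getNumPages()` as in the loop above. The output names follow the `file_out-0.png`, `file_out-1.png`, ... pattern that the command-line `convert` produces.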