
Export to an image instead of a PDF with the Python PyPDF2 package


I have the following code that crops part of a PDF file and then saves the output as a PDF:

from PyPDF2 import PdfFileWriter, PdfFileReader

with open("Sample.pdf", "rb") as in_f:
    input1 = PdfFileReader(in_f)
    output = PdfFileWriter()

    numPages = input1.getNumPages()
    print("Document Has %s Pages." % numPages)

    for i in range(1):
        page = input1.getPage(i)
        print(page.mediaBox.getUpperRight_x(), page.mediaBox.getUpperRight_y())
        page.trimBox.lowerLeft = (280, 280)
        page.trimBox.upperRight = (220, 200)
        page.cropBox.lowerLeft = (100, 720)
        page.cropBox.upperRight = (220, 800)
        output.addPage(page)

    with open("Output.pdf", "wb") as out_f:
        output.write(out_f)

How can I save the output as an image instead of a PDF? I found this code, but the output is not high quality. How can I improve the quality of the image output?

import fitz

pdffile = "Output.pdf"
doc = fitz.open(pdffile)
page = doc.loadPage(0)
pix = page.getPixmap()
output = "Output.png"  # writePNG writes PNG data, so use a .png extension
pix.writePNG(output)

Solution

  • You can use the pdf2image library for this. Add the following code at the end of your script:

    from pdf2image import convert_from_path

    # Render each page of the cropped PDF to a PIL image, then save it as a JPEG.
    images = convert_from_path('Output.pdf')
    for i, image in enumerate(images):
        image.save(f'Output{i}.jpg', 'JPEG')
    

    If you wish, you can then use the os library to delete the intermediate PDF, so that you do not have to remove it by hand:

    import os
    os.remove("Output.pdf")
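
On the quality part of the question: one option worth trying is to render at a higher resolution, since `convert_from_path` accepts a `dpi` argument (200 by default). The snippet below is a rough sketch, not tested against your file. The helper names `output_names` and `pdf_to_images` are made up for illustration, and it assumes pdf2image and poppler are installed. It saves PNG by default to avoid JPEG compression artifacts.

```python
from pathlib import Path


def output_names(pdf_path, page_count, ext="png"):
    """Build one output filename per page, e.g. Output_0.png (hypothetical helper)."""
    stem = Path(pdf_path).stem
    return [f"{stem}_{i}.{ext}" for i in range(page_count)]


def pdf_to_images(pdf_path, dpi=300, ext="png"):
    """Render every page of pdf_path at the given dpi and save one image per page."""
    # Imported here so output_names() can be used without pdf2image installed.
    from pdf2image import convert_from_path

    # A higher dpi yields more pixels per page, at the cost of larger files.
    images = convert_from_path(pdf_path, dpi=dpi)
    names = output_names(pdf_path, len(images), ext)
    for image, name in zip(images, names):
        # Pillow picks the file format from the extension.
        image.save(name)
    return names
```

You would call it as `pdf_to_images("Output.pdf", dpi=300)` after writing the cropped PDF. If the result is still not sharp enough, try raising `dpi` further.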