Update a fillable pdf using PyPDF2


I'm having trouble updating named fields in a fillable pdf. My code is as shown:

from PyPDF2 import PdfFileReader, PdfFileWriter

reader = PdfFileReader("invoice_template.pdf")
page = reader.getPage(0)

data_dict = {
    "business_name_1": "Consulting",
    "customer_name": "company.io",
    "customer_email": "[email protected]",
}

writer = PdfFileWriter()
writer.updatePageFormFieldValues(page, fields=data_dict)
writer.addPage(page)

with open("newfile.pdf", "wb") as fh:
    writer.write(fh)

I have checked the fields dictionary with reader.getFormTextFields() before and after calling updatePageFormFieldValues(), and the values do get updated. However, the generated PDF shows none of the field values. I'm not sure what I'm doing wrong. The PDF I'm using can be found here


Solution

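  • updatePageFormFieldValues() changes each field's stored value (/V), but it does not regenerate the appearance stream (/AP) that a viewer actually draws, so the fields still look empty. The problem is fixed by setting the /NeedAppearances entry of the PDF's AcroForm dictionary to true, which asks the viewer to rebuild the field appearances from the stored values. This can be done with a helper function: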

    from PyPDF2 import PdfFileWriter
    from PyPDF2.generic import BooleanObject, IndirectObject, NameObject

    def set_need_appearances_writer(writer: PdfFileWriter):
        # Set /NeedAppearances to true in the writer's /AcroForm dictionary.
        # See sections 12.7.2 and 7.7.2 of the PDF specification:
        # http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf
        try:
            catalog = writer._root_object
            # Add an /AcroForm entry to the catalog if it has none
            if "/AcroForm" not in catalog:
                catalog.update({
                    NameObject("/AcroForm"): IndirectObject(len(writer._objects), 0, writer)
                })

            catalog["/AcroForm"][NameObject("/NeedAppearances")] = BooleanObject(True)
            return writer

        except Exception as e:
            # Report the failure but still hand the writer back to the caller
            print('set_need_appearances_writer() catch : ', repr(e))
            return writer
    

    Then call set_need_appearances_writer(writer) right after the line writer = PdfFileWriter(), and the filled-in values should show up when the PDF is opened.

    You can view more information here: https://github.com/mstamy2/PyPDF2/issues/355

    Fixed code

    from PyPDF2 import PdfFileWriter, PdfFileReader
    from PyPDF2.generic import BooleanObject, NameObject, IndirectObject

    def set_need_appearances_writer(writer: PdfFileWriter):
        # Set /NeedAppearances to true in the writer's /AcroForm dictionary.
        # See sections 12.7.2 and 7.7.2 of the PDF specification:
        # http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf
        try:
            catalog = writer._root_object
            # Add an /AcroForm entry to the catalog if it has none
            if "/AcroForm" not in catalog:
                catalog.update({
                    NameObject("/AcroForm"): IndirectObject(len(writer._objects), 0, writer)
                })

            catalog["/AcroForm"][NameObject("/NeedAppearances")] = BooleanObject(True)
            return writer

        except Exception as e:
            # Report the failure but still hand the writer back to the caller
            print('set_need_appearances_writer() catch : ', repr(e))
            return writer

    myfile = PdfFileReader("invoice_template.pdf")
    first_page = myfile.getPage(0)

    writer = PdfFileWriter()
    set_need_appearances_writer(writer)

    data_dict = {
        'business_name_1': 'Consulting',
        'customer_name': 'company.io',
        'customer_email': '[email protected]',
    }

    # Write the new values into the page's form fields, then add the page
    writer.updatePageFormFieldValues(first_page, fields=data_dict)
    writer.addPage(first_page)

    with open("newfile.pdf", "wb") as new:
        writer.write(new)
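
If the fields still look empty after this, a quick sanity check is to confirm that the flag actually made it into the output file. The sketch below is not part of PyPDF2; has_need_appearances is a hypothetical helper name, and it only scans the raw bytes for the entry, so it assumes the dictionary holding the flag was written uncompressed (files that store it inside a compressed object stream would need a real PDF parser instead).

```python
import re

# Matches the dictionary entry "/NeedAppearances true", allowing any run of
# ASCII whitespace between the key and the value.
_NEED_APPEARANCES = re.compile(rb"/NeedAppearances\s+true\b")


def has_need_appearances(pdf_bytes: bytes) -> bool:
    """Return True if the raw PDF bytes contain '/NeedAppearances true'.

    Hypothetical helper for debugging: it does not parse the PDF, it only
    searches the bytes, so it cannot see inside compressed object streams.
    """
    return _NEED_APPEARANCES.search(pdf_bytes) is not None
```

To use it, read newfile.pdf in binary mode and pass the bytes in, for example has_need_appearances(open("newfile.pdf", "rb").read()). A False result suggests the helper function was never called on the writer, or that it hit the except branch.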