
How to update a field with PyPDF2


I'm trying to make a PDF generator and I'm almost there, but I can't figure out the final step: updating the form fields.

I'm using PyPDF2 on Windows with Python 3.6.

The first step is to download the PDF (there are many of them, but they are all very similar and share the same form fields). The following code then opens the PDF and writes a new one. My assumption is that if I update the dictionary of form fields and write it to the new file, the change will take effect. The problem is that I can't work out how to apply the updated dictionary.

from PyPDF2 import PdfFileReader, PdfFileWriter

pdf = open(file, 'rb')
flObj = PdfFileReader(pdf)
flObj.decrypt(password)
fields = flObj.getFormTextFields()  # {field name: current value}
writer = PdfFileWriter()
fields['DB_Code'] = '2809785'  # as an example; this only changes the local dict
for i in range(flObj.getNumPages()):
    writer.addPage(flObj.getPage(i))
with open(my_file, 'wb') as outputstream:
    writer.write(outputstream)
pdf.close()

I can see in the PyPDF2 documentation that there is an updatePageFormFieldValues(page, fields) method. However, the dictionary returned by getFormTextFields doesn't say which page each field is on (the fields are always spread across 4 pages of the PDF), so I'm not sure how to apply it.
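One possible way around the missing page information is to pass the same dictionary to every page, since updatePageFormFieldValues only changes annotations whose /T name matches a key. The sketch below assumes the PyPDF2 1.x method names used in the question and assumes each form widget carries its own /T entry; the helper name fill_fields_on_every_page is hypothetical, not part of PyPDF2.

```python
def fill_fields_on_every_page(reader, writer, values):
    """Copy all pages from reader to writer and fill matching text fields.

    `values` maps field names (the /T entry of a widget annotation) to the
    new text. Returns {page index: [field names found on that page]}.
    """
    found = {}
    for index in range(reader.getNumPages()):
        page = reader.getPage(index)
        writer.addPage(page)
        # Pages without form widgets have no /Annots entry; skip them.
        if "/Annots" not in page:
            continue
        names = []
        for annot in page["/Annots"]:
            title = annot.getObject().get("/T")
            names.extend(key for key in values if title == key)
        if names:
            # The same dict is passed for every page; only annotations
            # whose /T matches one of its keys are changed.
            writer.updatePageFormFieldValues(page, values)
            found[index] = names
    return found
```

With the names from the question, this would be called as fill_fields_on_every_page(flObj, writer, {'DB_Code': '2809785'}) in place of the addPage loop, before writer.write(...). The returned mapping shows which page each field was found on.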

I have looked at a number of other questions and solutions, such as this, but I don't feel they fit my needs.

Thanks in advance.


Solution

  • So the answer appears to be that I just had to look through the files and find the fields manually, page by page. Thankfully, the fields don't change location between documents.

    There does appear to be a bug (perhaps with PDFs generally) where the PDF is not redrawn. If you click on a field, you can see the new text that PyPDF2 entered, but you then have to copy and paste it manually for the change to show permanently.
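The redraw behaviour described above is often attributed to the filled fields having no regenerated appearance, and a commonly suggested workaround (not part of the original answer) is to set /NeedAppearances in the form's /AcroForm dictionary so that the viewer rebuilds the field display. The sketch below assumes PyPDF2 1.x; set_need_appearances is a hypothetical helper name, and _root_object is a private attribute of the writer that may differ between versions.

```python
def set_need_appearances(reader, writer):
    """Copy the reader's /AcroForm into the writer with /NeedAppearances on.

    Returns True if the flag was set, False if the source PDF has no form.
    """
    # Imported here so that defining the helper does not itself need PyPDF2.
    from PyPDF2.generic import BooleanObject, NameObject

    root = reader.trailer["/Root"]
    if "/AcroForm" not in root:
        return False
    acro_form = root["/AcroForm"]
    # Ask the viewer to rebuild field appearances from the stored values.
    acro_form.update({NameObject("/NeedAppearances"): BooleanObject(True)})
    # Assumption: `_root_object` is the writer's private document catalog.
    writer._root_object.update({NameObject("/AcroForm"): acro_form})
    return True
```

This would be called once, after the pages have been added and before the writer's write(...) call. Whether the fields then display without a click depends on the PDF viewer.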