python, python-3.x, base64, pypdf, python-pdfreader

How to store PDF in MySQL database without generating PDF file in Python


I have base64-encoded PDF data stored in a MySQL database, and I want to manipulate that data (update the form fields of the PDF). After that, I want to store the updated data back in the database without creating or writing a PDF file. My Python code is given below.

I am using PyPDF2, and the code below works.

import base64, io, PyPDF2

try:
    # Field names mapped to their new values (must be a dict, not a JSON string)
    data_dict = {"firstName": "John", "lastName": "Joe"}
    encodedDataOfPDF = base64.b64decode(data)  # decode the base64-encoded PDF data from the database

    file = io.BytesIO(encodedDataOfPDF)
    pdfReader = PyPDF2.PdfFileReader(file)
    pdfWriter = PyPDF2.PdfFileWriter()
    pdfWriter.appendPagesFromReader(pdfReader)

    # The form fields on the first page get updated here.
    pdfWriter.updatePageFormFieldValues(pdfWriter.getPage(0), data_dict)


    # If I uncomment the code below, it creates a PDF file with the updated data.
    # But I don't want a PDF file;
    # I just need the base64-encoded data of the updated file, which I will store in the database.

    # with open(data[1], 'wb') as f:
    #     pdfWriter.write(f)


except Exception as e:
    app.logger.info(str(e))

Note: Please also read the comments in the code

Thanks in advance.


Solution

  • After a lot of research, I finally found a proper way to get the updated data in encoded form: write the PDF to an in-memory stream (io.BytesIO) instead of a file, then base64-encode the stream's contents.

    import base64, io, PyPDF2
    
    try:
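        tempMemory = io.BytesIO()  # in-memory buffer that replaces the output file
        # Field names mapped to their new values (must be a dict, not a JSON string)
        data_dict = {"firstName": "John", "lastName": "Joe"}
        encodedDataOfPDF = base64.b64decode(data)  # decode the base64-encoded PDF data from the database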
    
        file = io.BytesIO(encodedDataOfPDF)
        pdfReader = PyPDF2.PdfFileReader(file)
        pdfWriter = PyPDF2.PdfFileWriter()
        pdfWriter.appendPagesFromReader(pdfReader)
    
        # The form fields on the first page get updated here.
        pdfWriter.updatePageFormFieldValues(pdfWriter.getPage(0), data_dict)
    
        pdfWriter.write(tempMemory)  # write the updated PDF into memory, not to disk
        newFileData = tempMemory.getvalue()  # raw bytes of the updated PDF
        newEncodedPDF = base64.b64encode(newFileData)  # base64-encoded bytes, ready to store
    
    
    except Exception as e:
        app.logger.info(str(e))
    

    This gives me the base64-encoded data without generating a PDF file.

    Thank you
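
The in-memory approach above does not depend on anything PDF-specific, so it can be pulled into a small helper. The following is only a minimal sketch: `writer_to_base64` and `FakeWriter` are hypothetical names introduced here, and `FakeWriter` is just a stand-in so the example runs without PyPDF2. The assumption is that the real writer exposes `write(stream)`, as `PdfFileWriter` does in the code above. Note that `base64.b64encode` returns `bytes`, so the sketch decodes the result to `str`, which may be more convenient for a text column.

```python
import base64
import io


def writer_to_base64(writer):
    """Serialize `writer` into memory and return the result as a base64 str."""
    buffer = io.BytesIO()   # in-memory stand-in for a file on disk
    writer.write(buffer)    # same call as pdfWriter.write(f), but nothing is written to disk
    # b64encode returns bytes; decode to str so it can go into a text column
    return base64.b64encode(buffer.getvalue()).decode("ascii")


class FakeWriter:
    """Hypothetical stand-in for PyPDF2.PdfFileWriter, used only for this demo."""

    def __init__(self, payload):
        self.payload = payload

    def write(self, stream):
        # A real PDF writer would serialize the document here
        stream.write(self.payload)


encoded = writer_to_base64(FakeWriter(b"%PDF-1.4 demo bytes"))
# Decoding the stored value should give back exactly the bytes that were written
print(base64.b64decode(encoded) == b"%PDF-1.4 demo bytes")
```

With the real code, `writer_to_base64(pdfWriter)` would take the place of the last three lines inside the `try` block.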