python weasyprint google-bucket

Python Weasyprint to Google Bucket


I am using Google Cloud Functions to generate PDFs.

I want to store the PDFs in a Google Cloud Storage bucket.

I know that I can save a PDF to a local file using the following code:

import weasyprint

# HTML source to render
pdf = "<html><title>Hello</title><body><p>Hi!</p></body></html>"

# Render the HTML to a PDF file on local disk
document = weasyprint.HTML(string=pdf, encoding='UTF-8')
document.write_pdf("Hello.pdf")

However, I want to store it in a Google Bucket, so I tried the following code:

# HTML source to render
pdf = "<html><title>Hello</title><body><p>Hi!</p></body></html>"

# HTML to PDF in Google Bucket
document = weasyprint.HTML(string=pdf, encoding='UTF-8')
client = storage.Client()
bucket = client.get_bucket("monthly-customer-reports")
blob = bucket.blob("Hello.pdf")
with blob.open("w") as f:
    f.write(str(document))

This stored a file named Hello.pdf in my Google Bucket, but it was not a valid PDF.


Solution

  • You are writing the string representation of the document object to the file, which is not PDF binary data. Instead, call write_pdf() without a target so that it returns the rendered PDF as bytes, then upload those bytes directly to Google Cloud Storage.

    from google.cloud import storage
    import weasyprint

    # HTML source to render
    pdf = "<html><title>Hello</title><body><p>Hi!</p></body></html>"

    # With no target argument, write_pdf() returns the PDF as bytes
    document = weasyprint.HTML(string=pdf, encoding='UTF-8')
    pdf_bytes = document.write_pdf()

    # Upload the bytes to the bucket with the PDF content type
    client = storage.Client()
    bucket = client.get_bucket("monthly-customer-reports")
    blob = bucket.blob("Hello.pdf")
    blob.upload_from_string(pdf_bytes, content_type='application/pdf')
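
  • To catch this kind of mistake before anything reaches the bucket, one option is to check the data first. The sketch below is only an illustration: upload_pdf and PDF_MAGIC are hypothetical names, not part of WeasyPrint or google-cloud-storage. It relies on the fact that a PDF file starts with the bytes %PDF-, and it assumes the bucket argument behaves like a google.cloud.storage.Bucket.

```python
PDF_MAGIC = b"%PDF-"  # a PDF file starts with these bytes


def upload_pdf(bucket, blob_name, pdf_bytes):
    """Upload pdf_bytes to bucket as blob_name, refusing data that is not a PDF.

    Hypothetical helper: `bucket` is assumed to behave like
    google.cloud.storage.Bucket, i.e. it has a .blob(name) method.
    """
    # str(document) would fail here, which is the bug from the question
    if not isinstance(pdf_bytes, bytes):
        raise TypeError(
            "expected bytes from write_pdf(), got " + type(pdf_bytes).__name__
        )
    # Bytes that do not begin with the PDF header are rejected as well
    if not pdf_bytes.startswith(PDF_MAGIC):
        raise ValueError("data does not start with %PDF-, so it is not a PDF")
    blob = bucket.blob(blob_name)
    blob.upload_from_string(pdf_bytes, content_type="application/pdf")
    return blob


# Usage with the real libraries (assumes credentials are configured):
#
#   import weasyprint
#   from google.cloud import storage
#
#   html = "<html><title>Hello</title><body><p>Hi!</p></body></html>"
#   pdf_bytes = weasyprint.HTML(string=html).write_pdf()
#   bucket = storage.Client().get_bucket("monthly-customer-reports")
#   upload_pdf(bucket, "Hello.pdf", pdf_bytes)
```

With this check in place, passing str(document) raises a TypeError instead of silently storing an invalid file.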