I have a docx file in my AWS S3 bucket and need to read it using python-docx. I wrote this:
from docx import Document
document = Document('https://my-first-backup-bucket-v1.s3-ap-southeast-1.amazonaws.com/New+Proposed+Quote.docx')
Then I get this error: PackageNotFoundError: Package not found at 'https://my-first-backup-bucket-v1.s3-ap-southeast-1.amazonaws.com/New+Proposed+Quote.docx'
Why?
When I access the same file from a browser, it opens successfully. For testing purposes I created this file with public access, so anyone can try it. Can anyone please help with this?
From Document objects — python-docx 0.8.10 documentation:
docx.Document(docx=None)
Return a Document object loaded from docx, where docx can be either a path to a .docx file (a string) or a file-like object. If docx is missing or None, the built-in default document “template” is loaded.
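This says that the docx argument must be either a path to a .docx file on the local file system or a file-like object. It does not say that a URL is accepted, so python-docx treats the URL string as a local path, fails to find a file there, and raises PackageNotFoundError.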
Therefore, you should download the file from Amazon S3 first, then pass python-docx either the path to the local copy or a file-like object holding the downloaded bytes.
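As a minimal sketch of that approach, the following downloads the object over HTTPS with the standard library and hands python-docx an in-memory file-like object instead of a URL. It assumes the object is publicly readable, as described in the question. The helper names fetch_to_buffer and open_remote_docx and the 30-second timeout are made up for illustration and are not part of python-docx.

```python
import io
import urllib.request


def fetch_to_buffer(url, timeout=30):
    """Download `url` and return its bytes wrapped in an in-memory file-like object.

    Hypothetical helper for illustration; it is not part of python-docx.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return io.BytesIO(response.read())


def open_remote_docx(url):
    """Fetch a .docx over HTTP(S) and load it with python-docx."""
    # Imported here so the download helper can be used without python-docx installed.
    from docx import Document

    # Document() accepts a file-like object, so no temporary file is needed.
    return Document(fetch_to_buffer(url))


if __name__ == "__main__":
    # URL taken from the question; assumed to be publicly readable.
    url = (
        "https://my-first-backup-bucket-v1.s3-ap-southeast-1.amazonaws.com/"
        "New+Proposed+Quote.docx"
    )
    document = open_remote_docx(url)
    for paragraph in document.paragraphs:
        print(paragraph.text)
```

If the bucket is private, a plain HTTPS request like this would be refused; in that case the object would have to be downloaded with AWS credentials (for example through boto3) before passing the resulting bytes or local file to Document.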