
How to insert a rendered HTML template into Google Docs using Python


I need to insert an HTML template into a Google Docs document programmatically using Python. I know there is no native, built-in feature for this in the Google Docs editor or the Google Docs API, but I tried a few tricks to reach my objective. For now I'm ignoring where in the document the content should go; successfully inserting it anywhere is enough.

My approach was:

  1. Upload an HTML file to Google Drive as application/vnd.google-apps.document, because Google Drive converts the HTML to Docs format automatically. (Not perfect, but it works.)
  2. Get the file content using the Google Docs API get(), which returns it in the Google Docs JSON format.
  3. Insert that content into the target file using the Google Docs API batchUpdate().

def insert_template_to_file(target_file_id, content):
    media = MediaIoBaseUpload(BytesIO(content.encode('utf-8')), mimetype='text/html', resumable=True)
    body = {
        'name': 'document_test_html',
        'mimeType': 'application/vnd.google-apps.document',
        'parents': [DOC_FOLDER_ID]
    }

    content_file_id = None
    try:
        # Upload the HTML as a Google Doc; Drive converts HTML to Docs format automatically
        content_file = driver_service.files().create(body=body, media_body=media).execute()
        content_file_id = content_file.get('id')

        # Read the converted content back as Google Docs JSON
        doc = docs_service.documents().get(documentId=content_file_id, fields='body').execute()
        request_content = doc.get('body').get('content')

        # Try to insert the converted content into the target file
        result = docs_service.documents().batchUpdate(documentId=target_file_id, body={'requests': request_content}).execute()
        print(result)

        # Delete the temporary converted document
        driver_service.files().delete(fileId=content_file_id).execute()
        print("Content inserted successfully")
    except HttpError as error:
        # Delete the temporary document even on failure, if it was created
        if content_file_id:
            driver_service.files().delete(fileId=content_file_id).execute()
        print(f"An error occurred: {error}")

The problem: the content I collect in step 2 doesn't match what batchUpdate() requires. I'm trying to adapt the content from step 2 to fit step 3, but without success so far.

The target solution: take a string of HTML code and insert the rendered HTML into a target file on Google Docs. The objective is to append the HTML to the existing content of the target file, not to overwrite it.

Does my approach make sense? Do you have any other ideas for reaching my goal?


Solution

  • I believe your goal is as follows.

    • You want to append HTML data to a Google Document by rendering the HTML.
    • You want to achieve this using googleapis for Python.

    Unfortunately, at the current stage, it seems that the JSON object retrieved by "Method: documents.get" cannot be used directly as the request body of "Method: documents.batchUpdate".
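
    To illustrate the mismatch: documents.get returns structural elements (paragraphs containing text runs), whereas documents.batchUpdate expects a list of request objects such as insertText. The following is only a minimal sketch, not part of the original script. The helper name build_insert_text_requests is hypothetical, and it carries over plain paragraph text only, so styles, tables, and images are lost.

```python
def build_insert_text_requests(content):
    """Convert documents.get body content into insertText requests (plain text only)."""
    requests = []
    for element in content:
        paragraph = element.get("paragraph")
        if paragraph is None:
            continue  # skip section breaks, tables, and other non-paragraph elements
        # Join the text of every text run in this paragraph
        text = "".join(
            run["textRun"]["content"]
            for run in paragraph.get("elements", [])
            if "textRun" in run
        )
        if text:
            requests.append({
                "insertText": {
                    "text": text,
                    # An empty endOfSegmentLocation appends at the end of the document body
                    "endOfSegmentLocation": {},
                }
            })
    return requests
```

    The returned list could then be passed as body={'requests': requests} to documents().batchUpdate() for the target document.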

    However, if you want to append HTML to an existing Google Document, I think this can be achieved using only the Drive API. How about the following sample script?

    Sample script:

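    from io import BytesIO

    from googleapiclient.errors import HttpError
    from googleapiclient.http import MediaIoBaseDownload, MediaIoBaseUpload


    def insert_template_to_file(target_file_id, content):
        # Export the current Google Document as HTML (drive_service is an authorized Drive API client)
        request = drive_service.files().export(fileId=target_file_id, mimeType="text/html")
        file = BytesIO()
        downloader = MediaIoBaseDownload(file, request)
        done = False
        while done is False:
            status, done = downloader.next_chunk()
            print("Download %d%%" % int(status.progress() * 100))
        file.seek(0)
        current_html = file.read()
    
        # Append the new HTML to the exported HTML and upload the result as a Google Document
        media = MediaIoBaseUpload(BytesIO(current_html + content.encode('utf-8')), mimetype='text/html', resumable=True)
        body = {'mimeType': 'application/vnd.google-apps.document'}
    
        try:
            # Replace the content of the target document with the combined HTML
            result = drive_service.files().update(fileId=target_file_id, body=body, media_body=media).execute()
            print(result)
            print("Content inserted successfully")
        except HttpError as error:
            print(f"An error occurred: {error}")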
    
    • In this modified script, the existing Google Document is exported as HTML, the new HTML is appended to the exported HTML, and the Google Document is then updated with the combined HTML. Since your original data is HTML, I thought this method might work in your situation.
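
    The script above concatenates the two HTML strings directly. If the exported HTML ends with closing </body> and </html> tags, the new markup ends up after them. As a hedged alternative, the following sketch places the fragment just before the last closing </body> tag and falls back to plain concatenation when no such tag is found. The helper name merge_html is hypothetical.

```python
import re


def merge_html(current_html: bytes, fragment: str) -> bytes:
    """Insert an HTML fragment before the last closing </body> tag, or append it if none exists."""
    encoded = fragment.encode("utf-8")
    # Find every closing body tag, ignoring case and optional whitespace
    matches = list(re.finditer(rb"</body\s*>", current_html, flags=re.IGNORECASE))
    if not matches:
        return current_html + encoded
    pos = matches[-1].start()
    return current_html[:pos] + encoded + current_html[pos:]
```

    In the sample script, BytesIO(merge_html(current_html, content)) could be used in place of BytesIO(current_html + content.encode('utf-8')).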

    Note:

    • This script overwrites the Google Document identified by target_file_id, so I recommend testing it with a sample Google Document.
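
    Because the document is overwritten, one option is to copy it first with the Drive API. This is a minimal sketch under assumptions: backup_document and the " (backup)" suffix are hypothetical, and drive_service is the same authorized Drive API client used in the sample script.

```python
def backup_document(drive_service, file_id, suffix=" (backup)"):
    """Copy a Drive file before it is overwritten and return the ID of the copy."""
    # Look up the current name so the copy can be recognized later
    original = drive_service.files().get(fileId=file_id, fields="name").execute()
    copy = drive_service.files().copy(
        fileId=file_id,
        body={"name": original["name"] + suffix},
        fields="id",
    ).execute()
    return copy["id"]
```

    Calling backup_document(drive_service, target_file_id) before insert_template_to_file() keeps a copy that can be restored if the converted result is not what you expected.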
