python pywin32 win32com python-docx

Update the TOC (table of contents) of MS Word .docx documents with Python


I use the Python package "python-docx" to modify the structure and content of MS Word .docx documents. The package lacks the ability to update the TOC (table of contents) [Python: Create a "Table Of Contents" with python-docx/lxml].

Are there workarounds to update the TOC of a document? I thought about using "win32com.client" from the Python package "pywin32" [https://pypi.python.org/pypi/pypiwin32] or a comparable PyPI package offering "cli control" capabilities for MS Office.

I tried the following:

I changed the document.docx to document.docm and implemented the following macro [http://word.tips.net/T000301_Updating_an_Entire_TOC_from_a_Macro.html]:

Sub update_TOC()

If ActiveDocument.TablesOfContents.Count = 1 Then _
  ActiveDocument.TablesOfContents(1).Update

End Sub

If I change the content (add/remove headings) and run the macro, the TOC is updated. I save the document and I am happy.

I implemented the following Python code, which should be equivalent to the macro:

import win32com.client

def update_toc(docx_file):
    word = win32com.client.DispatchEx("Word.Application")
    doc = word.Documents.Open(docx_file)
    toc_count = doc.TablesOfContents.Count
    if toc_count == 1:
        toc = doc.TablesOfContents(1)
        toc.Update
        print('TOC should have been updated.')
    else:
        print('TOC has not been updated for sure...')

update_toc(docx_file) is called in a higher-level script (which manipulates the TOC-relevant content of the document). After this function call the document is saved (doc.Save()), closed (doc.Close()) and the Word instance is closed (word.Quit()). However, the TOC is not updated.

Does MS Word perform additional actions after macro execution which I did not consider?


Solution

  • Here is a snippet that updates the TOC of a Word 2013 .docx document containing exactly one table of contents (e.g. a TOC of headings only, no table of figures etc.). When the script update_toc.py is run from the command prompt (Windows 10, command prompt not "running as admin") with python update_toc.py, the system installation of Python opens the file doc_with_toc.docx located in the same directory as the script, updates the TOC (in my case the headings) and saves the changes to the same file. The document must not be open in another instance of Word 2013 and must not be write-protected. Note that Update() is called with parentheses; in the question's code, toc.Update only references the method without calling it, which is most likely why the TOC was not updated there. Be aware that this script does not do the same as selecting the whole document content and pressing the F9 key.

    Content of update_toc.py:

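    import os
    import win32com.client
    
    def update_toc(docx_file):
        # DispatchEx starts a separate Word instance instead of attaching to a running one.
        word = win32com.client.DispatchEx("Word.Application")
        try:
            doc = word.Documents.Open(docx_file)
            # The parentheses matter: Update() calls the method.
            doc.TablesOfContents(1).Update()
            doc.Close(SaveChanges=True)
        finally:
            # Quit Word even if opening or updating failed; discard anything left unsaved.
            word.Quit(SaveChanges=False)
    
    def main():
        # Build an absolute path to the document next to this script.
        script_dir = os.path.dirname(os.path.abspath(__file__))
        file_name = 'doc_with_toc.docx'
        file_path = os.path.join(script_dir, file_name)
        update_toc(file_path)
    
    if __name__ == "__main__":
        main()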
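
The script above assumes exactly one table of contents. For documents with several tables of contents, or with tables of figures, a minimal sketch along the following lines might work. The helper names update_all_tocs and update_docx are my own and not part of pywin32; TablesOfContents, TablesOfFigures, Count and Update() are members of the Word object model. I have only outlined this and not run it against every Word version.

```python
import os


def update_all_tocs(doc):
    """Update every table of contents and table of figures in an open document.

    `doc` is assumed to be a Word Document COM object (hypothetical helper).
    Returns the number of tables that were updated.
    """
    updated = 0
    for collection in (doc.TablesOfContents, doc.TablesOfFigures):
        # Word collections are 1-based, so iterate from 1 to Count inclusive.
        for index in range(1, collection.Count + 1):
            collection(index).Update()  # parentheses are needed to call the method
            updated += 1
    return updated


def update_docx(docx_file):
    """Open docx_file in a separate Word instance, update its tables and save."""
    # Imported here so that update_all_tocs has no Windows-only dependency.
    import win32com.client

    word = win32com.client.DispatchEx("Word.Application")
    try:
        doc = word.Documents.Open(os.path.abspath(docx_file))
        count = update_all_tocs(doc)
        doc.Close(SaveChanges=-1)  # -1 corresponds to wdSaveChanges
        return count
    finally:
        # Always release the Word process; 0 corresponds to wdDoNotSaveChanges.
        word.Quit(SaveChanges=0)
```

Calling update_docx('doc_with_toc.docx') would then replace update_toc(file_path) in the script. This still is not a full F9 refresh: other fields in the document are left untouched. The Word object model offers doc.Fields.Update() for the fields in the main text, which I have not tested here.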