I am trying to go through a Word document and find a few specific tables among many. I know how to iterate through all tables using either the docx library or win32, found here. However, I need to access only a few specific tables, not all of them.
These tables have headings in the format Table A.x.x-x Insert table summary. They are text headings above the tables, not within the tables themselves. These don't show up when I use doc.ListParagraphs from win32, however, so I can't successfully iterate through the tables in that manner.
I know the name of the table I need to access. There is unrelated text throughout the document. There aren't any blanket similarities in the tables I need to find, so I can't just look for a specific value in a certain cell or something like that.
Does anyone have suggestions on how to approach this? Preferably using win32 COM, but I'm open to any solutions.
I figured out an answer, using this discussion. Thanks for the clarification on which win32 COM function to use!
From the discussion, I used the code for iter_block_items. I also made a list of the titles of the tables I wanted, called listOfTables. I then used the following code, which outputs a dictionary whose keys are the table titles and whose values are the tables themselves.
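import docx

dox = docx.Document(path)
count = False  # True once a wanted title has been seen and its table is still pending
tables = {}
for item in iter_block_items(dox):
    try:
        text = item.text  # paragraphs have .text; tables raise AttributeError
        if text in listOfTables:
            title = text  # remember only matching titles, not every paragraph
            count = True
    except AttributeError:
        if count:
            tables[title] = item
            count = False
print(tables)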
When the loop reaches a table, it falls into the except branch, because a table has no attribute 'text'. If count is true at that point, meaning a preceding paragraph contained one of the wanted table titles, the title and the table itself are stored in the dictionary. This pairs each title with the appropriate table, so I have easy access to the table I need.
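For anyone who would rather not rely on try/except to tell paragraphs from tables, here is a minimal sketch of the same idea with explicit type checks. The names pair_titles_with_tables and docx_blocks are my own, not from the linked discussion or from python-docx. The pairing logic is kept separate from the document traversal so it can be tried out without a .docx file. It assumes each heading's paragraph text exactly matches an entry in listOfTables after stripping surrounding whitespace.

```python
def pair_titles_with_tables(blocks, wanted_titles):
    """Map each wanted title to the first table that follows it.

    `blocks` is an iterable of (kind, value) pairs in document order, where
    kind is "paragraph" (value is the paragraph text) or "table" (value is
    the table object).
    """
    wanted = set(wanted_titles)
    pending = None  # the wanted title still waiting for its table
    result = {}
    for kind, value in blocks:
        if kind == "paragraph":
            text = value.strip()
            if text in wanted:
                pending = text
        elif kind == "table" and pending is not None:
            result[pending] = value
            pending = None
    return result


def docx_blocks(path):
    """Yield (kind, value) pairs for the body of a .docx file.

    Hypothetical helper; requires python-docx and walks the body's child
    elements in the same way iter_block_items does.
    """
    import docx
    from docx.oxml.table import CT_Tbl
    from docx.oxml.text.paragraph import CT_P
    from docx.table import Table
    from docx.text.paragraph import Paragraph

    document = docx.Document(path)
    for child in document.element.body.iterchildren():
        if isinstance(child, CT_P):
            yield "paragraph", Paragraph(child, document).text
        elif isinstance(child, CT_Tbl):
            yield "table", Table(child, document)
```

Usage would look like tables = pair_titles_with_tables(docx_blocks(path), listOfTables). Unrelated paragraphs (including empty ones) between a heading and its table are skipped, and a table with no pending title is ignored.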