I am using Python to read a Word file that contains many tables. I need to extract data only from certain tables, depending on the sections they appear in. Is there any way to search through the file, reach a certain line, and read the table that appears after that line?
For example, if the Word document is something like:
1
2
3
[table]
4
5
6
[table]
would I be able to read the table specifically after the '6'?
Reading the 'second table' by index would not work, because the number of tables that appear before that table is arbitrary; I need to identify it by the fact that it appears after the '6'.
The code here may be of interest: https://github.com/python-openxml/python-docx/issues/276#issuecomment-199502885.
What you're looking for, I believe, is a way to iterate over the block-level items in a document in the order they appear. The body of a Word document has two types of block-level items: paragraphs and tables. The function at the link above lets you iterate over those in document order, so you can scan for the paragraph you care about and then take the next table that follows it.