I need to extract all the text from nested tables (tables inside tables inside a table) in a Word document. I haven't been able to do it with python-docx, perhaps because of my lack of knowledge.
Please suggest some code examples.
You will want some sort of recursion. The basic idea is:
def iter_paragraphs_of_tables(tables):
    for table in tables:
        for row in table.rows:
            for cell in row.cells:
                yield from cell.paragraphs
                yield from iter_paragraphs_of_tables(cell.tables)

for paragraph in iter_paragraphs_of_tables(document.tables):
    print(paragraph.text)
This is Python 3. If you're on Python 2, you'll need to expand each yield from
statement into an explicit loop, for example:
yield from cell.paragraphs
# --- becomes ---
for paragraph in cell.paragraphs:
    yield paragraph
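If it helps to see which text came from which nesting level, the same recursion can carry a depth counter. The following is only a sketch of that idea, not part of python-docx: the name iter_table_text and the depth parameter are made up for illustration, and the function relies only on the rows, cells, paragraphs, tables and text attributes already used above. It also skips paragraphs that are empty after stripping whitespace, which is a choice you may not want.

```python
def iter_table_text(tables, depth=0):
    """Yield (depth, text) for each non-empty paragraph found in `tables`.

    Hypothetical helper: `depth` is 0 for top-level tables and grows by one
    for every level of nesting. It only assumes objects exposing .rows,
    .cells, .paragraphs, .tables and .text, as python-docx tables do.
    """
    for table in tables:
        for row in table.rows:
            for cell in row.cells:
                # Text that sits directly in this cell.
                for paragraph in cell.paragraphs:
                    text = paragraph.text.strip()
                    if text:
                        yield depth, text
                # Tables nested inside this cell are one level deeper.
                yield from iter_table_text(cell.tables, depth + 1)


# Usage with python-docx (the file name is a placeholder):
#     from docx import Document
#     document = Document("example.docx")
#     for depth, text in iter_table_text(document.tables):
#         print("    " * depth + text)
```

As in the original function, a cell's own paragraphs are yielded before the contents of any tables nested in it, so the output is grouped per cell rather than guaranteed to follow the exact reading order inside the cell.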