I am trying to get data from tables (including nested tables) in a .docx document. My current code looks like this:
def pctnt():
    tables = doc.tables
    for table in tables:
        for row in table.rows:
            for cell in row.cells:
                for paragraph in cell.paragraphs:
                    print(paragraph.text)
                for table in cell.tables:
                    for row in table.rows:
                        for cell in row.cells:
                            for paragraph in cell.paragraphs:
                                print(paragraph.text)
                            for table in cell.tables:
                                for row in table.rows:
                                    for cell in row.cells:
                                        for paragraph in cell.paragraphs:
                                            print(paragraph.text)
It works OK for my current .docx because I know how many levels of nested tables it contains.
However, that will not be the case for other documents coming in, so I need a way to retrieve the data from nested tables no matter how deeply they are nested.
NEW QUESTION based on the solution given by @Boendal
Is it possible to collect the data into a list instead of printing it, so I can print a beautified table using pandas or search for a specific table cell?
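With the description you gave and your code fragment, this should work. Both the document and each cell expose a .tables attribute, so the same function can call itself on every cell and follow the nesting to any depth:
def print_paragraphs(container):
    # container is either the document or a table cell; both have .tables
    for table in container.tables:
        for row in table.rows:
            for cell in row.cells:
                for paragraph in cell.paragraphs:
                    print(paragraph.text)
                # recurse into any tables nested inside this cell
                print_paragraphs(cell)

print_paragraphs(doc)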
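For the follow-up about collecting the data instead of printing it, here is a minimal sketch of one possible approach. The function name `iter_tables` and the `(depth, rows)` output shape are my own choices, not part of python-docx. It only assumes the attributes already used above (`.tables`, `.rows`, `.cells`, `.paragraphs`, `.text`), and the file name in the usage comment is a placeholder.

```python
def iter_tables(container, depth=0):
    """Yield (depth, rows) for every table in container, nested ones included.

    container is anything with a .tables attribute (the document or a cell).
    rows is a list of rows, each row a list of cell text strings.
    """
    for table in container.tables:
        rows = []
        cells_seen = []
        for row in table.rows:
            texts = []
            for cell in row.cells:
                # join the cell's paragraphs into one string
                texts.append("\n".join(p.text for p in cell.paragraphs))
                cells_seen.append(cell)
            rows.append(texts)
        yield depth, rows
        # after the parent table, descend into tables nested in its cells
        for cell in cells_seen:
            yield from iter_tables(cell, depth + 1)


# Hypothetical usage (requires python-docx and pandas; file name is a placeholder):
# from docx import Document
# import pandas as pd
# doc = Document("example.docx")
# for depth, rows in iter_tables(doc):
#     print(pd.DataFrame(rows))
```

Each yielded `rows` is a plain list of lists, so it can be passed to `pandas.DataFrame` for display, or scanned with an ordinary loop to find a specific cell. The text of a nested table is not included in its parent cell's string here; it comes out as a separate table with a larger `depth`.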