I have a table in which some cells contain text with a line drawn above it. The line changes the meaning of the text, so I want to determine, for each cell, whether it contains a line shape.
From what I saw, there is cell.part.inline_shapes, but it gives the same result for every cell in the table, and it doesn't specify the actual shape (line, rectangle, square, etc.).
E.g. in the following table, only cell [1, 0] contains a line:
def is_line(shape):
    # TODO: implement
    pass


def is_containing_line(cell):
    # TODO: check if shape is in current cell, as cell.part.inline_shapes are the same in every table cell
    cell_shapes = cell.part.inline_shapes
    return any(is_line(shape) for shape in cell_shapes)


[i for i, cell in enumerate(table.columns[column_index].cells[starting_row:])
 if is_containing_line(cell)]
Output of print(cell._tc.xml) for a cell that contains a line shape:
<w:tc xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml"
xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture">
<w:tcPr>
<w:tcW w:w="1734" w:type="dxa" />
<w:tcBorders>
<w:top w:val="single" w:sz="12" w:space="0" w:color="000000" />
<w:bottom w:val="nil" />
</w:tcBorders>
</w:tcPr>
<w:p>
<w:pPr>
<w:pStyle w:val="TableParagraph" />
<w:spacing w:before="10" />
<w:rPr>
<w:b/>
<w:sz w:val="2" />
</w:rPr>
</w:pPr>
</w:p>
<w:p>
<w:pPr>
<w:pStyle w:val="TableParagraph" />
<w:spacing w:line="20" w:lineRule="exact" w:before="0" />
<w:ind w:left="755" />
<w:rPr>
<w:sz w:val="2" />
</w:rPr>
</w:pPr>
<w:r>
<w:rPr>
<w:sz w:val="2" />
</w:rPr>
<w:pict>
<v:group style="width:11.1pt;height:.6pt;mso-position-horizontal-relative:char;mso-position-vertical-relative:line" coordorigin="0,0" coordsize="222,12">
<v:rect style="position:absolute;left:0;top:0;width:222;height:12" filled="true" fillcolor="#000000" stroked="false">
<v:fill type="solid" />
</v:rect>
</v:group>
</w:pict>
</w:r>
<w:r>
<w:rPr>
<w:sz w:val="2" />
</w:rPr>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:pStyle w:val="TableParagraph" />
<w:spacing w:before="0" />
<w:ind w:left="453" w:right="449" />
<w:jc w:val="center" />
<w:rPr>
<w:sz w:val="16" />
</w:rPr>
</w:pPr>
<w:r>
<w:rPr>
<w:sz w:val="16" />
</w:rPr>
<w:t>EN</w:t>
</w:r>
</w:p>
</w:tc>
Output of print(cell._tc.xml) for a cell that doesn't contain a line shape:
<w:tc xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml"
xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture">
<w:tcPr>
<w:tcW w:w="1734" w:type="dxa" />
</w:tcPr>
<w:p>
<w:pPr>
<w:pStyle w:val="TableParagraph" />
<w:spacing w:before="55" />
<w:ind w:left="453" w:right="448" />
<w:jc w:val="center" />
<w:rPr>
<w:sz w:val="16" />
</w:rPr>
</w:pPr>
<w:r>
<w:rPr>
<w:sz w:val="16" />
</w:rPr>
<w:t>BOUT</w:t>
</w:r>
</w:p>
</w:tc>
There is no API support for this in python-docx.
However, this function will tell you whether a drawing (inline shape) is present in a paragraph. Note that, depending on the Word version, such an item may appear as a <w:pict> element (legacy VML markup) instead of a <w:drawing> element (DrawingML):
def has_inline_shape(paragraph):
    """Return True if `paragraph` contains an inline shape."""
    return (
        bool(paragraph._p.xpath(".//w:drawing"))
        or bool(paragraph._p.xpath(".//w:pict"))
    )
You can apply it to each paragraph in a cell to determine whether the cell contains such a shape:
def cell_contains_inline_shape(cell):
    """Return True if an inline-shape appears in `cell`."""
    return any(
        has_inline_shape(paragraph)
        for paragraph in cell.paragraphs
    )