My PowerPoint slide has a number of group shapes in which there are child text shapes.
Earlier I was using this code, but it doesn't handle Group shapes.
for eachfile in files:
    prs = Presentation(eachfile)
    textrun = []
    for slide in prs.slides:
        for shape in slide.shapes:
            if hasattr(shape, "text"):
                print(shape.text)
                textrun.append(shape.text)
    new_list = " ".join(textrun)
    text_list.append(new_list)
I am trying to extract the text from these child text boxes. I tried to reach the child elements using GroupShape.shapes, but I get an error saying they are of type 'property', so I cannot access the text or iterate over them (TypeError: 'property' object is not iterable).
from pptx.shapes.group import GroupShape
from pptx import Presentation
for eachfile in files:
    prs = Presentation(eachfile)
    textrun = []
    for slide in prs.slides:
        for shape in slide.shapes:
            for text in GroupShape.shapes:
                print(text)
I would then like to capture the text and append it to a string for further processing.
So my question is: how do I access the child text elements and extract the text from them?
I have spent a lot of time going through the documentation and source code, but haven't been able to figure it out. Any help would be appreciated.
I think you need something like this:
from pptx.enum.shapes import MSO_SHAPE_TYPE
for slide in prs.slides:
    # ---only operate on group shapes---
    group_shapes = [
        shp for shp in slide.shapes
        if shp.shape_type == MSO_SHAPE_TYPE.GROUP
    ]
    for group_shape in group_shapes:
        for shape in group_shape.shapes:
            if shape.has_text_frame:
                print(shape.text)
A group shape contains other shapes, accessible through its .shapes property. It does not itself have a .text property. So you need to iterate the shapes in the group and get the text from each of those.
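As for the TypeError in the question, it most likely comes from looking up .shapes on the GroupShape class rather than on an individual group shape object. A minimal, library-free sketch of that behavior follows; the Group class is a hypothetical stand-in and not part of python-pptx.

```python
class Group:
    """Hypothetical stand-in for a class that exposes its children via a property."""

    def __init__(self, children):
        self._children = children

    @property
    def shapes(self):
        return self._children


# Looked up on the class, the attribute is the property object itself.
print(type(Group.shapes))  # <class 'property'>

# Looked up on an instance, the property is evaluated and returns the children.
group = Group(["a", "b"])
print(list(group.shapes))  # ['a', 'b']

# Iterating the class-level attribute reproduces the error from the question.
try:
    for child in Group.shapes:
        pass
except TypeError as exc:
    message = str(exc)
    print(message)  # 'property' object is not iterable
```

The same applies in the loop from the question: iterate shape.shapes on each group shape found in slide.shapes, not GroupShape.shapes.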
Note that this solution only goes one level deep. A recursive approach could be used to walk the tree depth-first and get text from groups containing groups if there were any.
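A rough sketch of that recursive approach is below. The names iter_text, extract_text, and the stand-in objects are made up for illustration. The sketch also assumes that only group shapes expose a .shapes collection, so it detects groups by that attribute instead of comparing shape_type with MSO_SHAPE_TYPE.GROUP.

```python
from types import SimpleNamespace


def iter_text(shapes):
    """Yield the text of each text-bearing shape, descending into groups depth-first."""
    for shape in shapes:
        # Assumption: only group shapes have a `.shapes` collection.
        if hasattr(shape, "shapes"):
            yield from iter_text(shape.shapes)
        elif getattr(shape, "has_text_frame", False):
            yield shape.text


def extract_text(path):
    """Return all text found in the presentation at `path`, joined by spaces."""
    from pptx import Presentation  # imported lazily; requires python-pptx

    prs = Presentation(path)
    return " ".join(
        text for slide in prs.slides for text in iter_text(slide.shapes)
    )


def box(text):
    """Stand-in for a text box, carrying only the attributes iter_text() reads."""
    return SimpleNamespace(has_text_frame=True, text=text)


# Stand-in shape tree (no .pptx file needed): a group nested inside a group.
picture = SimpleNamespace(has_text_frame=False)
inner = SimpleNamespace(shapes=[box("c"), picture])
outer = SimpleNamespace(shapes=[box("b"), inner])
print(list(iter_text([box("a"), outer, box("d")])))  # ['a', 'b', 'c', 'd']
```

With real files, something like text_list.append(extract_text(eachfile)) inside the loop over files would collect one string per presentation, as in the original code.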
Also note that not all shapes have text, so you must check the .has_text_frame property to avoid raising an exception on, say, a picture shape.