I am trying to extract all images from a PPTX file using python-pptx. I succeeded in doing this for shapes of the picture type:
shape.shape_type == MSO_SHAPE_TYPE.PICTURE
but I am struggling to extract images that were added to the slide as the background image of an autoshape:
shape.shape_type == MSO_SHAPE_TYPE.AUTO_SHAPE
Is there some way to extract the background image from an autoshape, or is this simply not possible through the API?
Unfortunately that's not currently possible via the API. You'd have to go to the underlying XML, find the blipFill element (or similar), and use the rId it carries to get to the related image.
You can inspect the XML of the shape using:
print(shape._sp.xml)
Then you can use XPath to acquire the rId value:
rId = shape._sp.xpath({xpath expr to rId attr})[0]
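To make the XPath step concrete, here is a minimal sketch of where the rId usually sits. The sample XML is hand-written for illustration (the rId2 value and the rectangle geometry are assumptions, not output from a real file), and it assumes the picture fill is stored as a:blipFill inside p:spPr, so check it against what print(shape._sp.xml) shows for your shape. It uses only the standard library so it runs without python-pptx installed.

```python
import xml.etree.ElementTree as ET

# Standard Office Open XML namespace URIs for the p:, a: and r: prefixes.
NS = {
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
    "r": "http://schemas.openxmlformats.org/officeDocument/2006/relationships",
}

# Hand-written stand-in for the output of print(shape._sp.xml) on an
# autoshape with a picture fill (illustrative only; "rId2" is made up).
SAMPLE_SP_XML = """<p:sp
    xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main"
    xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main"
    xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <p:spPr>
    <a:prstGeom prst="rect"><a:avLst/></a:prstGeom>
    <a:blipFill>
      <a:blip r:embed="rId2"/>
      <a:stretch><a:fillRect/></a:stretch>
    </a:blipFill>
  </p:spPr>
</p:sp>"""


def fill_rid(sp_xml):
    """Return the r:embed value of the shape's picture fill, or None."""
    root = ET.fromstring(sp_xml)
    blip = root.find("./p:spPr/a:blipFill/a:blip", NS)
    if blip is None:
        return None  # the shape has no picture fill
    # Namespaced attributes are keyed as "{namespace-uri}localname".
    return blip.get(f"{{{NS['r']}}}embed")


print(fill_rid(SAMPLE_SP_XML))  # prints: rId2
```

If your XML has this layout, the equivalent expression on the shape element would be "./p:spPr/a:blipFill/a:blip/@r:embed". The xpath() method on python-pptx elements already knows the p:, a: and r: prefixes, so no namespace mapping needs to be passed there.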
Once you have the rId value you can acquire a reference to the ImagePart using:
image_part = slide.part.related_part(rId)
Once you have the image part, I expect you know what to do from there, using things like image_part.image: https://github.com/scanny/python-pptx/blob/master/pptx/parts/image.py#L21
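Putting the steps together, a rough sketch might look like the following. The helper names (iter_fill_image_parts, save_fill_images), the output file naming and the XPath constant are all my own assumptions rather than anything python-pptx provides, and the XPath only matches fills stored as a:blipFill under p:spPr. It relies on slide.part.related_part(rId) as used above, plus the blob and ext properties of the Image object; older python-pptx releases may expose the lookup as slide.part.related_parts[rId] instead.

```python
from pathlib import Path

# Assumed location of the picture-fill relationship id inside a shape element.
# Verify against print(shape._sp.xml) if nothing is found for your file.
FILL_RID_XPATH = "./p:spPr/a:blipFill/a:blip/@r:embed"


def iter_fill_image_parts(slide):
    """Yield (shape, image_part) for each top-level shape with a picture fill."""
    for shape in slide.shapes:
        rids = shape._element.xpath(FILL_RID_XPATH)
        if not rids:
            continue  # no picture fill on this shape
        # Resolve the rId against the slide part's relationships.
        yield shape, slide.part.related_part(rids[0])


def save_fill_images(prs, out_dir):
    """Write each picture-fill image in presentation `prs` into `out_dir`."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for slide_no, slide in enumerate(prs.slides, start=1):
        parts = iter_fill_image_parts(slide)
        for n, (_shape, image_part) in enumerate(parts, start=1):
            image = image_part.image  # exposes .blob (bytes) and .ext
            path = out_dir / f"slide{slide_no}_fill{n}.{image.ext}"
            path.write_bytes(image.blob)
            written.append(path)
    return written


# Usage (needs python-pptx; "deck.pptx" and "extracted" are placeholders):
#   from pptx import Presentation
#   save_fill_images(Presentation("deck.pptx"), "extracted")
```

Note that slide.shapes only covers top-level shapes, so autoshapes nested inside a group shape would need a recursive walk that this sketch does not attempt.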