
python-pptx `insert_chart` incrementally slowing down


We have an application that creates large .pptx files with over 1000 slides, using the python-pptx library.

The problem is that, as the presentation grows, adding shapes and/or charts to it becomes progressively slower.

from pptx import Presentation
from pptx.chart.data import CategoryChartData
from pptx.enum.chart import XL_CHART_TYPE
from pptx.util import Inches


# Index 5 is the "Title Only" layout in the default template.
SLD_LAYOUT_TITLE_ONLY = 5

prs = Presentation()

# Look the layout up once instead of on every iteration.
slide_layout = prs.slide_layouts[SLD_LAYOUT_TITLE_ONLY]
for idx in range(2000):
    slide = prs.slides.add_slide(slide_layout)

    chart_data = CategoryChartData()
    chart_data.categories = ['East', 'West', 'Midwest']
    chart_data.add_series('Series 1', (19.2, 21.4, 16.7))

    x, y, cx, cy = Inches(2), Inches(2), Inches(6), Inches(4.5)
    slide.shapes.add_chart(
        XL_CHART_TYPE.COLUMN_CLUSTERED, x, y, cx, cy, chart_data
    )

    # Progress output; each line appears more slowly as the deck grows.
    print(idx)

prs.save('test.pptx')

Has anyone come across this before? It seems that python-pptx has to look something up inside the presentation on each insert, which makes every iteration slower than the last. Or is it the way we are using Python to loop and load the variables into memory?
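One way to tell the two possibilities apart is to time the loop in batches. If the loop itself were the cost, every batch would take roughly the same time; if the per-insert work grows with the deck, later batches take longer. A minimal sketch follows, using only the standard library. `time_batches` and its parameters are hypothetical helper names, not part of python-pptx.

```python
import time


def time_batches(step, total, batch_size):
    """Call step(i) for i in range(total) and return seconds spent per batch.

    `step` is any callable taking the iteration index. A trailing partial
    batch (when total is not a multiple of batch_size) is not reported.
    """
    durations = []
    start = time.perf_counter()
    for i in range(total):
        step(i)
        if (i + 1) % batch_size == 0:
            now = time.perf_counter()
            durations.append(now - start)
            start = now
    return durations


if __name__ == "__main__":
    # Stand-in workload with roughly constant cost per call. Replace it with
    # a function that adds one slide and one chart to the presentation.
    sink = []
    for n, seconds in enumerate(time_batches(sink.append, 2000, 200)):
        print(f"batch {n}: {seconds:.6f}s")
```

Passing a function that wraps the body of the loop above as `step` should show whether the batch times climb steadily as slides are added.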


Solution

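  • This appears to be O(N^2) behavior in the chart and slide partname assignment, not a problem with your loop. In other words, the work needed to name each new chart or slide part seems to grow with the number of parts already in the package, so adding N slides costs on the order of N^2 in total. More details are in the GitHub issue thread: https://github.com/scanny/python-pptx/issues/644#issuecomment-685056215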
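To illustrate why this kind of name assignment turns quadratic, here is a toy model in plain Python. It is not python-pptx's actual code: the function names, the probe counter, and the name pattern are assumptions made for the example. It compares choosing each new name by probing the names already taken against simply remembering the last number used.

```python
TEMPLATE = "/ppt/charts/chart%d.xml"  # illustrative name pattern only


def add_parts_by_scanning(count):
    """Toy model: pick each new name by probing names already in use.

    Returns (names, probes), where probes counts the name checks made.
    """
    names, taken, probes = [], set(), 0
    for _ in range(count):
        n = 1
        while True:
            probes += 1
            candidate = TEMPLATE % n
            if candidate not in taken:
                break
            n += 1
        taken.add(candidate)
        names.append(candidate)
    return names, probes


def add_parts_with_counter(count):
    """Toy model: remember the last number used, so each name costs one step."""
    names, probes = [], 0
    for n in range(1, count + 1):
        probes += 1
        names.append(TEMPLATE % n)
    return names, probes


if __name__ == "__main__":
    print(add_parts_by_scanning(1000)[1])   # 500500, i.e. N * (N + 1) / 2
    print(add_parts_with_counter(1000)[1])  # 1000
```

Both functions produce the same names; only the amount of work differs. Doubling the number of parts roughly quadruples the probes in the scanning version, which matches the steadily slowing inserts described in the question.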