Creating a PowerPoint file reader (PPTX)


I'm looking for an open-source PPTX reader (preferably in C#) that I can modify and embed in a 3D engine (a customer request), or at least a tutorial on the basics. I have already searched on Google but couldn't find any useful resources.

I know it's possible to write a new reader from the PPTX file format specification (ECMA-376), but that looks like a huge project, and I would prefer to build this component on top of existing code.


Solution

  • Your options

    The best option really depends on the modifications you need to make. If you want to manipulate the PowerPoint presentation heavily (draw new shapes, rotate shapes, add charts, add slides or master slides, etc.), you may find an abstraction layer like the Aspose.Slides library (proprietary) very useful.

    If you do not want to pay for a library, the Open XML SDK is available for .NET. It lets you manipulate every aspect of the PPTX document without Interop/COM, because it works directly on the XML parts inside the PPTX package.
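
    As a minimal sketch of the OpenXML route, the following reads the text of every slide in presentation order. It assumes the Open XML SDK NuGet package (DocumentFormat.OpenXml) is referenced; the class name ReadSlides and the path deck.pptx are placeholders for illustration.

    ```csharp
    using System;
    using System.Linq;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Presentation;
    using A = DocumentFormat.OpenXml.Drawing;

    class ReadSlides
    {
        static void Main(string[] args)
        {
            // "deck.pptx" is a placeholder; pass a real file as the first argument.
            string path = args.Length > 0 ? args[0] : "deck.pptx";

            // Open read-only (second argument: isEditable = false).
            using (PresentationDocument doc = PresentationDocument.Open(path, false))
            {
                PresentationPart presentationPart = doc.PresentationPart;
                if (presentationPart?.Presentation?.SlideIdList == null)
                {
                    Console.WriteLine("No slides found.");
                    return;
                }

                // SlideIdList holds the slides in presentation order;
                // each entry points at a slide part through a relationship id.
                int number = 1;
                foreach (SlideId slideId in
                         presentationPart.Presentation.SlideIdList.Elements<SlideId>())
                {
                    var slidePart =
                        (SlidePart)presentationPart.GetPartById(slideId.RelationshipId);

                    // a:t elements carry the visible text runs of the slide.
                    var texts = slidePart.Slide.Descendants<A.Text>().Select(t => t.Text);
                    Console.WriteLine($"Slide {number++}: {string.Join(" | ", texts)}");
                }
            }
        }
    }
    ```

    This only extracts text. Shapes, positions, and images live in the same slide XML, so a renderer for a 3D engine would need to walk those elements as well.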

    From personal experience, having used both, Aspose is by far the easier solution, but it adds some overhead and of course has a cost. The OpenXML route is lightweight but has a learning curve.

    Last but not least, you can take a look at NetOffice, which achieves something similar to Aspose; it is a little lighter and has reduced functionality. It also covers other Office formats. Note that NetOffice is a wrapper around the Office COM object model, so, unlike the other two options, it needs Office installed on the machine.

    To sum up your options: Aspose.Slides (proprietary), the Open XML SDK, and NetOffice.

    My advice

    If you need to do some simple modifications (e.g. extract a slide, change a bit of text somewhere, replace an image) I would go with OpenXML.
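
    As an illustration of such a simple modification, here is a hedged sketch that replaces a piece of text on every slide. It again assumes the Open XML SDK NuGet package (DocumentFormat.OpenXml); the class name ReplaceText and the command-line arguments are hypothetical.

    ```csharp
    using System;
    using DocumentFormat.OpenXml.Packaging;
    using A = DocumentFormat.OpenXml.Drawing;

    class ReplaceText
    {
        static void Main(string[] args)
        {
            if (args.Length < 3 || args[1].Length == 0)
            {
                Console.WriteLine("usage: ReplaceText <file.pptx> <oldText> <newText>");
                return;
            }
            string path = args[0], oldText = args[1], newText = args[2];

            // Open for editing (isEditable = true); the file is modified in place.
            using (PresentationDocument doc = PresentationDocument.Open(path, true))
            {
                PresentationPart presentationPart = doc.PresentationPart;
                if (presentationPart == null) return;

                foreach (SlidePart slidePart in presentationPart.SlideParts)
                {
                    bool changed = false;
                    foreach (A.Text run in slidePart.Slide.Descendants<A.Text>())
                    {
                        if (run.Text.Contains(oldText))
                        {
                            run.Text = run.Text.Replace(oldText, newText);
                            changed = true;
                        }
                    }
                    // Write the slide XML back only if something was replaced.
                    if (changed) slidePart.Slide.Save();
                }
            }
        }
    }
    ```

    One caveat: PowerPoint may split a phrase across several text runs, in which case this per-run match will not find it. Work on a copy of the file while experimenting.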

    If you want to draw slides in a bespoke manner, I would go with Aspose. I have used Aspose in a 50,000 LoC application to build hundreds of thousands of PowerPoint decks (up to 100 slides at times) using WCF. Aspose drew each slide and generated all the shapes. A deck takes about 4-5 seconds to generate, and a single slide can be processed in around 200 ms. The load time of Aspose and a few small issues with it can be irritating. Also, Aspose presentations are not serializable, which is annoying if you want to cache the results in some form.

    If you want to read the PPTX and somehow convert it to images, Aspose.Slides is a good candidate because it allows you to convert a PPTX slide to SVG which you can subsequently process. Note that there are some PPTX2SVG engines out there (XSLT) but the ones I know are written in Java (Apache).

    Notes

    The libraries I mentioned all target the .NET/C# environment. Aspose.Slides and the Open XML SDK do not require Office, Interop, or COM to be installed.