javascript parsing powerpoint

PowerPoint file structure


I'm trying to build a JavaScript parser for .ppt files. PPTX is no big deal since it's an "open" format, but I'm really lost regarding the file structure of a .ppt file and can't find any useful information.

Has anyone tried this before, or can someone at least point me to the 'spec' for .ppt so I can build the parser?

Best Regards, Celso Santos


Solution

  • .ppt is a binary file format. You can read the 1997-2007 spec here

    Not to discourage you from trying, but be aware that this may wind up being a daunting, almost impossible task for a single developer, since the entire spec represents thousands of programming hours over 10 years.

    Joel Spolsky has a good article on dealing with these file formats.

    Just for completeness' sake, here is the spec for the pptx file format.
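
To make the "binary vs. open format" difference concrete, here is a minimal sketch of a first step a parser could take: checking the leading bytes of a file to see which container it uses. Legacy Office binaries such as .ppt are stored in an OLE2 compound file, which starts with a fixed 8-byte signature, while .pptx is a ZIP archive. The names `sniffContainer` and `startsWith` are made up for this illustration and are not part of any library; this only identifies the container and does not parse any PowerPoint records.

```javascript
// Leading bytes of an OLE2 compound file (the container used by .ppt, .doc, .xls).
const CFB_MAGIC = [0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1];
// Leading bytes of a ZIP local file header (the container used by .pptx).
const ZIP_MAGIC = [0x50, 0x4b, 0x03, 0x04];

// Hypothetical helper: true if `bytes` begins with the `magic` sequence.
function startsWith(bytes, magic) {
  if (bytes.length < magic.length) return false;
  return magic.every((b, i) => bytes[i] === b);
}

// Hypothetical helper: classify a file by its first bytes.
// `bytes` can be a Uint8Array, a Node Buffer, or a plain array of numbers.
function sniffContainer(bytes) {
  if (startsWith(bytes, CFB_MAGIC)) return "cfb"; // candidate .ppt
  if (startsWith(bytes, ZIP_MAGIC)) return "zip"; // candidate .pptx
  return "unknown";
}
```

In a browser you would typically get the bytes with `new Uint8Array(await file.arrayBuffer())`; in Node, `fs.readFileSync(path)` returns a Buffer that works the same way. A "cfb" result only tells you the file is a compound document; the actual slide data still has to be located inside it and decoded according to the binary spec.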