
Get the page count of a Word document and the slide count of a PPT


Is it possible to get the number of pages in a Word document or the number of slides in a PPT file?

I have done a lot of research on this and have not found a solution. From what I have seen, it is very difficult to do in PHP on a Linux server.

I would be OK with Java as well, if it is possible there. I checked the Apache POI library, but would it work for all of ppt, pptx, doc and docx?

I have searched hard for a solution without success. Any help would be really appreciated.


Solution

  • To get metadata properties of doc, docx, ppt and pptx files, such as the number of pages or slides, I followed the process below and it worked like a charm. Hope it helps someone:

    1. Download and configure Apache Tika

    2. Once that is done, run the following commands to print all the metadata for your file (the -m option outputs only metadata):

    java -jar tika-app-1.5.jar -m test.docx
    java -jar tika-app-1.5.jar -m test.doc
    java -jar tika-app-1.5.jar -m test.pptx
    java -jar tika-app-1.5.jar -m test.ppt
    

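    Once you have confirmed that the commands work, you can run them from a PHP script (for example with exec() or shell_exec()) and read the count from the output. Thanks.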
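
Since the question mentions Java would be fine too, here is a minimal sketch of doing the same thing from Java: it runs the tika-app jar with -m, splits each "key: value" line of the output, and picks out a page or slide count. The class name TikaMetadata and its helper methods are made up for illustration, and the key names it looks for (xmpTPg:NPages, Page-Count, Slide-Count) are an assumption about what Tika prints for these formats; check the -m output for your own files and adjust the list.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of Tika.
class TikaMetadata {

    // Turn "key: value" lines into a map; lines without ": " are ignored.
    static Map<String, String> parse(String output) {
        Map<String, String> meta = new LinkedHashMap<>();
        for (String line : output.split("\\r?\\n")) {
            int sep = line.indexOf(": ");
            if (sep > 0) {
                meta.put(line.substring(0, sep).trim(), line.substring(sep + 2).trim());
            }
        }
        return meta;
    }

    // Run "java -jar <tikaJar> -m <file>" and return everything it printed.
    static String run(String tikaJar, String file) throws IOException, InterruptedException {
        Process p = new ProcessBuilder("java", "-jar", tikaJar, "-m", file)
                .redirectErrorStream(true) // merge stderr so the process cannot block on it
                .start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        int exit = p.waitFor();
        if (exit != 0) {
            throw new IOException("tika-app exited with code " + exit);
        }
        return out.toString();
    }

    // Return the value of the first key that is present, or null if none is.
    static String firstPresent(Map<String, String> meta, String... keys) {
        for (String key : keys) {
            if (meta.containsKey(key)) {
                return meta.get(key);
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // args[0] = path to the tika-app jar, args[1] = document to inspect
        Map<String, String> meta = parse(run(args[0], args[1]));
        // Assumed key names; verify them against the real -m output.
        String count = firstPresent(meta, "xmpTPg:NPages", "Page-Count", "Slide-Count");
        System.out.println(count == null ? "no page or slide count found" : count);
    }
}
```

After compiling, it could be run as: java TikaMetadata tika-app-1.5.jar test.docx. The same "split on the first colon followed by a space" approach should carry over to a PHP script that captures the command output.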