python docx doc odt

Number of pages of a Word document with Python


Is there an efficient way to get the number of pages of a Word document (.doc, .docx) with Python?

And for an .odt file?

I want to use this for a web application based on Web2py on Linux.

Thank you!


Solution

  • You can read the value

    <Properties>
    <Pages>CountValue</Pages>
    

    from docProps/app.xml in the docx package or

    <office:document-meta>
        <office:meta>
            <meta:document-statistic meta:page-count="CountValue">
    

    from meta.xml in the odt package.

    If these values do not exist (they are optional), you have to compute the page count yourself by laying out the entire document, in effect rendering it, which is much more difficult.
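Reading the two stored values described above could look roughly like the sketch below. It is a minimal illustration, not a definitive implementation: the function names `docx_page_count` and `odt_page_count` are hypothetical, it uses only the standard library (`zipfile`, `xml.etree.ElementTree`), and it assumes the usual namespace URIs for the extended-properties part and the OpenDocument meta namespace. It does not handle the binary .doc format, and the number it returns is only whatever the last application that saved the file wrote there.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs assumed for the two metadata files.
APP_NS = "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"
META_NS = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"


def docx_page_count(path):
    """Return the page count stored in docProps/app.xml, or None if absent."""
    with zipfile.ZipFile(path) as package:
        try:
            xml_bytes = package.read("docProps/app.xml")
        except KeyError:  # the part is optional
            return None
    pages = ET.fromstring(xml_bytes).find(f"{{{APP_NS}}}Pages")
    if pages is None or not (pages.text or "").strip():
        return None
    return int(pages.text)


def odt_page_count(path):
    """Return the meta:page-count stored in meta.xml, or None if absent."""
    with zipfile.ZipFile(path) as package:
        try:
            xml_bytes = package.read("meta.xml")
        except KeyError:  # the file is optional
            return None
    stat = ET.fromstring(xml_bytes).find(f".//{{{META_NS}}}document-statistic")
    if stat is None:
        return None
    value = stat.get(f"{{{META_NS}}}page-count")
    return int(value) if value is not None else None
```

Both functions accept a file path or a file-like object, since `zipfile.ZipFile` takes either. A return value of `None` means the stored count is missing, which is the case where the document would have to be rendered to count its pages.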