python django word-count

How to get a word count of a Word document in Python?


I am trying to get the word counts of .doc, .docx, .odt and .pdf files. This is pretty simple for .txt files, but how can I do a word count on the types mentioned?

I'm using Python and Django on Ubuntu, and I want to count the words in a document when a user uploads a file through the system.


Solution

  • First, you need to read the text out of your .doc, .docx, .odt and .pdf files.

    Second, count the words (version for Python earlier than 2.7).
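The two steps above can be sketched for the zip-based formats using only the standard library. This is a rough illustration, not the answerer's original code: it assumes Python 3, the helper names `extract_text` and `count_words` are hypothetical, and it only handles .docx and .odt, which are zip archives containing an XML part. The binary .doc format and .pdf are not covered here and need a dedicated extractor. The count is approximate, since tabs, line breaks and nested text boxes inside a paragraph are not treated specially.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used for body text in each format.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
ODF_TEXT_NS = "{urn:oasis:names:tc:opendocument:xmlns:text:1.0}"

# For each supported extension: (XML part inside the zip, paragraph-like tags).
FORMATS = {
    ".docx": ("word/document.xml", {W_NS + "p"}),
    ".odt": ("content.xml", {ODF_TEXT_NS + "p", ODF_TEXT_NS + "h"}),
}


def extract_text(source, extension):
    """Step 1: return the paragraph text of a .docx or .odt file.

    `source` may be a path or a seekable binary file-like object.
    (Hypothetical helper, named for this sketch only.)
    """
    part, paragraph_tags = FORMATS[extension.lower()]
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read(part))
    paragraphs = []
    for element in root.iter():
        if element.tag in paragraph_tags:
            # Join the pieces of one paragraph without a separator, so a
            # word split across formatting runs is still counted once.
            paragraphs.append("".join(element.itertext()))
    return "\n".join(paragraphs)


def count_words(source, extension):
    """Step 2: count whitespace-separated tokens in the extracted text."""
    return len(extract_text(source, extension).split())
```

Because `zipfile.ZipFile` accepts file-like objects as well as paths, an uploaded file object should be usable as `source` without saving it to disk first, provided it is seekable, for example `count_words(uploaded_file, ".docx")`.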