python, python-3.x, paragraph

Detect paragraph breaks and store each paragraph in its own variable in Python 3


I have a docx file that I read in PyCharm using textract. The file contains text with multiple paragraphs. I want to detect every paragraph break and store each paragraph as a string, either in separate variables or in a list, for later use.

How can I do that in Python 3?

Please help!

I haven't found anything on this.


Solution

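  • You can do this with the Document class from the python-docx package (imported as docx), which exposes each paragraph of the file as a separate object: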

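    # The package is installed as "python-docx" but imported as "docx"
    from docx import Document

    document = Document('path/to/your/file.docx')
    # One string per paragraph, in document order (empty paragraphs give '')
    paragraphs = [para.text for para in document.paragraphs]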
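
If you would rather keep working with the plain text you already extracted with textract, one option is to split that text on blank lines. The following is only a minimal sketch: it assumes the extracted text separates paragraphs with one or more blank lines, which may not hold for every file, and `split_paragraphs` is a hypothetical helper name, not part of textract or python-docx.

```python
import re


def split_paragraphs(text):
    """Split plain text into paragraphs on blank lines (hypothetical helper)."""
    # Assumption: paragraphs are separated by at least one blank line.
    chunks = re.split(r"\n\s*\n", text.strip())
    # Trim each chunk and skip any that are empty.
    return [chunk.strip() for chunk in chunks if chunk.strip()]


# Sample text standing in for the extracted document contents.
sample = "First paragraph.\n\nSecond paragraph,\nstill second.\n\n\nThird."
paragraphs = split_paragraphs(sample)
print(paragraphs)
```

The helper expects a `str`, so if your extraction step gives you `bytes`, decode it first (for example with `.decode('utf-8')`, assuming UTF-8 output). A list is usually easier to work with than separate variables, since you can still reach any paragraph by index, e.g. `paragraphs[0]`.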