Writing a for loop that creates separate docx files as it iterates using the python-docx module?


At the moment, my script creates the doc and adds the specified text just fine. It's worth noting that the text I'm adding is job listings parsed from HTML. What I'm trying to figure out next is how to have the script iterate over the listings and create a separate docx file for each one. I tried moving document = Document() into the for loop, but that doesn't seem to work: it then only creates a doc for the first listing. Is this even possible?

import requests
from bs4 import BeautifulSoup
from docx import Document

document = Document()

for idx, item in enumerate(opps):
    title = item.find('h2').text
    description = item.find('p').text.strip()[0:]
    link = item.find("a").get("href")
    document.add_paragraph(
        (f'Title:{title}', f'Description: {description}\n', f'Link: {link}\n')
    )
    document.save('wordy.docx')

Solution

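  • Your document always has the same name, so each iteration of the loop overwrites the previous file. A quick fix is to put the loop index in the filename: document.save(f'{idx}_wordy.docx'). Also, document = Document() should be inside the loop, so that each file starts from an empty document instead of accumulating the earlier listings.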