I have the following methods as part of a class:
    def download_page(self, page: namedtuple):
        r = requests.get(page.link)
        r.raise_for_status()
        with open(f'{page.number}.jpg', 'wb') as f:
            f.write(r.content)

    def download_chapter(self, chapter: namedtuple):
        try:
            os.mkdir(chapter.name)
        except FileExistsError:
            print("This folder already exists. It will be overwritten.")
        os.chdir(chapter.name)
        page_list = self.get_pages(chapter.link)
        with concurrent.futures.ProcessPoolExecutor(max_workers=2) as executor:
            executor.map(self.download_page, page_list)
        os.chdir('..')
The problem is that when I call download_chapters() from my main file, all the directories are created but they are empty. The images that the executor should save are nowhere to be seen. The whole thing also finishes very quickly, so I'm guessing the executor is not working at all. I have another script that uses ProcessPoolExecutor in a very similar way and it works as expected, so I have no idea what I'm missing.
Also, if I replace the executor part with this:
    for _ in page_list:
        self.download_page(_)
everything works properly, so my other functions are doing their job.
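executor.map() returns an iterator over the results.
The calls are submitted as soon as map() is invoked, but an exception raised in a worker is only re-raised when its result is retrieved from that iterator, so if you never iterate, any failure passes silently. You'll need to iterate over it somehow to see what went wrong; if you don't need the results, just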
    list(executor.map(self.download_page, page_list))
to have a list created of the results and subsequently thrown away.
If you don't want to gather that list of None values,
    for _ in executor.map(self.download_page, page_list):
        pass
does just as well.
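
One practical consequence of consuming the iterator is that exceptions raised inside the workers become visible. The sketch below is only an illustration under stated assumptions: download_page here is a hypothetical stand-in that fails on page 2 instead of making a request, and ThreadPoolExecutor is swapped in for ProcessPoolExecutor purely to keep the example self-contained (results are retrieved from the map() iterator in the same way for both).

```python
from concurrent.futures import ThreadPoolExecutor


def download_page(number):
    # Hypothetical stand-in for the real download: page 2 always fails.
    if number == 2:
        raise ValueError(f"could not fetch page {number}")
    return number


def run_without_iterating(pages):
    # The result iterator is discarded, so the ValueError is never seen.
    with ThreadPoolExecutor(max_workers=2) as executor:
        executor.map(download_page, pages)
    return "finished without any visible error"


def run_with_iterating(pages):
    # Consuming the iterator re-raises the first worker exception here.
    with ThreadPoolExecutor(max_workers=2) as executor:
        try:
            list(executor.map(download_page, pages))
        except ValueError as exc:
            return f"error surfaced: {exc}"
    return "finished without any visible error"


print(run_without_iterating([1, 2, 3]))
print(run_with_iterating([1, 2, 3]))
```

If the real download_page fails in the worker for any reason, the second form is the one that would show the traceback instead of leaving empty folders behind.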