python-3.x beautifulsoup python-requests python-os

The same file is saved repeatedly instead of one file per image (loop / range problem)

My code runs, but it has one flaw: the files are not saved correctly. For example, say the page has 3 JPEG files. When I run the code, it saves 3 times to slot 1, 3 times to slot 2, and 3 times to slot 3, so I end up with 3 identical files.

I think something is wrong with my loop logic. If I change for n in range(len(soup_imgs)): to for n in range(len(src)):, it keeps saving copies of the last JPEG file, seemingly without end.

soup_imgs = soup.find(name='div', attrs={'class':'t_msgfont'}).find_all('img', alt="", src=re.compile(".jpg"))
for i in soup_imgs:
    src = i['src']
    print(src)

dirPath = "C:\\__SPublication__\\" 
img_folder = dirPath + '/' + soup_title + '/'
if (os.path.exists(img_folder)):
    pass
else:
    os.mkdir(img_folder)

for n in range(len(src)):
    n += 1
    img_name = dirPath + '/' + soup_title + '/' + str({}).format(n) + '.jpg'
    img_files = open(img_name, 'wb')
    img_files.write(requests.get(src).content)
    print("Outputs：" + img_name)

I am an amateur at coding and only started recently as a hobby. Please give me some guidance, chiefs.

Solution

Try this when you are writing your image files:

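from os import path

# Loop over the image tags themselves, so each src is used exactly once.
# start=1 numbers the files 1.jpg, 2.jpg, ... as in the question's code.
for i, img in enumerate(soup_imgs, start=1):
    src = img['src']
    img_name = path.join(dirPath, soup_title, "{}.jpg".format(i))
    # The with block closes the file even if the write fails.
    with open(img_name, 'wb') as f:
        f.write(requests.get(src).content)
    print("Outputs：{}".format(img_name))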

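You need to loop over all image sources rather than reuse the last src value left over from the previous for block. Once that first loop finishes, src holds only the final URL as a string, so every iteration of the second loop downloads that same image. Using range(len(src)) additionally runs the loop once per character in the URL, which is why it seemed to keep saving the last file.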
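To see why the original loop misbehaves, here is a small sketch that needs no network access. The dictionaries and example.com URLs are hypothetical stand-ins for the img tags that BeautifulSoup returns (both support img['src'] indexing), so treat this as an illustration rather than the exact data from the page.

```python
# Hypothetical stand-ins for the scraped <img> tags.
soup_imgs = [
    {"src": "http://example.com/a.jpg"},
    {"src": "http://example.com/b.jpg"},
    {"src": "http://example.com/c.jpg"},
]

for i in soup_imgs:
    src = i["src"]  # overwritten on every pass through the loop

# After the loop, src holds only the last URL, as a plain string.
last_src = src

# len() of a string counts its characters, not the number of images,
# so range(len(src)) loops once per character of the last URL.
iterations = len(src)

# Collecting the URLs in a list keeps every one of them.
all_srcs = [img["src"] for img in soup_imgs]

print(last_src)
print(iterations, "iterations versus", len(all_srcs), "images")
```

Iterating over all_srcs (or over soup_imgs directly) gives one download per image.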

I've also switched to path.join for building directory and file paths, which is safer and should be OS independent. Finally, when opening a file, always use the with open() as f: construct, so that Python closes the file handle for you automatically.
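The path-joining and file-handling advice can be tried in isolation with something like the sketch below. The save_bytes helper, the "some_title" folder name, and the byte strings are made up for illustration; in the real script the data would come from requests.get(src).content. It also uses os.makedirs with exist_ok=True as one possible replacement for the exists/mkdir check in the question.

```python
import os
import tempfile


def save_bytes(folder, index, data):
    """Write data to <folder>/<index>.jpg and return the full path.

    Hypothetical helper, not part of the original answer.
    """
    # Create the folder if needed; no error if it already exists.
    os.makedirs(folder, exist_ok=True)
    # os.path.join inserts the correct separator for the current OS.
    img_name = os.path.join(folder, "{}.jpg".format(index))
    # The file is closed automatically when the with block ends.
    with open(img_name, "wb") as f:
        f.write(data)
    return img_name


# Use a temporary directory so the sketch does not touch real folders.
base = tempfile.mkdtemp()
folder = os.path.join(base, "some_title")

# Placeholder payloads standing in for downloaded image bytes.
payloads = [b"one", b"two", b"three"]
saved = [save_bytes(folder, n, data) for n, data in enumerate(payloads, start=1)]

for name in saved:
    print("Outputs: {}".format(name))
```

Each payload ends up in its own numbered file, which is the behaviour the question was after.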