I would like to download some pictures from a forum. find_all gives me most of what I want, which are JPEG files. However, it also returns a few GIF files that I do not want. Another problem is that the GIF file is an attachment, not a valid link, and it causes trouble when I save the files.
soup_imgs = soup.find(name='div', attrs={'class': 't_msgfont'}).find_all('img', alt="")
for i in soup_imgs:
    src = i['src']
    print(src)
I tried to exclude the GIF files in my find_all search, but without success: both the JPEG and GIF files are in the same section. How should I filter my results? Please give me some help, chief. I am an amateur at coding; playing with Python is just a hobby of mine.
You can filter the results by passing a regular expression for the src attribute to find_all. Please refer to the following example. Hope this helps.
import re
from bs4 import BeautifulSoup
data='''<html>
<body>
<h2>List of images</h2>
<div class="t_msgfont">
<img src="img_chania.jpeg" alt="" width="460" height="345">
<img src="wrongname.gif" alt="">
<img src="img_girl.jpeg" alt="" width="500" height="600">
</div>
</body>
</html>'''
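soup = BeautifulSoup(data, "html.parser")
# Keep only <img> tags whose src ends in ".jpeg". The dot is escaped so it matches
# a literal ".", and "$" anchors the match to the end of the src value.
soup_imgs = soup.find('div', attrs={'class': 't_msgfont'}).find_all('img', alt="", src=re.compile(r"\.jpeg$"))
for i in soup_imgs:
    src = i['src']
    print(src)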
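If the forum's image links carry query strings or mixed-case extensions, a plain regular expression on the whole src can miss them. As a rough sketch, a small predicate can check only the path part of the URL. The name is_jpeg, the ALLOWED set, and the sample src values below are made up for illustration and are not taken from the original forum page.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions treated as JPEG (an assumption; adjust to what the forum actually serves).
ALLOWED = {".jpg", ".jpeg"}


def is_jpeg(src):
    """Return True if src points to a file whose path ends in a JPEG extension."""
    if not src:
        # Guards against a missing or empty src attribute.
        return False
    # Drop any query string or fragment, then compare the extension case-insensitively.
    path = urlparse(src).path
    return PurePosixPath(path).suffix.lower() in ALLOWED


# Hypothetical src values, similar to what a forum page might contain.
sample_srcs = [
    "img_chania.jpeg",
    "wrongname.gif",
    "attachment.php?aid=123",
    "photos/img_girl.JPG?size=large",
]
jpeg_srcs = [s for s in sample_srcs if is_jpeg(s)]
print(jpeg_srcs)
```

BeautifulSoup also accepts a function as an attribute filter, so the same predicate could be used in place of the regular expression, as in find_all('img', src=is_jpeg).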