I wrote this code to find all the firms' links, but it only finds the first two and then stops. Any idea why, and how can I change it?
import requests
from bs4 import BeautifulSoup
url = "https://www.gelbeseiten.de/branchen/rechtsanwalt/mannheim"
req = requests.get(url)
src = req.text
soup = BeautifulSoup(src, "lxml")
all_firmas = soup.find_all("article", class_="mod mod-Treffer")
for i in all_firmas:
    i_2 = i.next_element.next_element
    print(i_2.get("href"))
print("Category done!")
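On the page you linked, only two articles have exactly the class "mod mod-Treffer". The other articles have the class "mod mod-Treffer mod-Treffer--kurz". A class_ string that contains a space is compared against the whole class attribute, so those articles are skipped.
The following code picks up those other articles too, using a regex (add import re at the top):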
# .* rather than .+ so the two articles whose class is exactly "mod mod-Treffer" still match
all_firmas = soup.find_all("article", class_=re.compile("mod mod-Treffer.*"))
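As an alternative to the regex, here is a minimal sketch that matches on the single class token "mod-Treffer" instead of the full class string. The HTML snippet and the example.com links are made up to mimic the two class variants described above, and it assumes each article contains an anchor with an href. The real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Made-up HTML that mimics the two class variants; the real page's markup may differ.
html = """
<article class="mod mod-Treffer"><a href="https://example.com/a">A</a></article>
<article class="mod mod-Treffer"><a href="https://example.com/b">B</a></article>
<article class="mod mod-Treffer mod-Treffer--kurz"><a href="https://example.com/c">C</a></article>
<article class="mod mod-Other"><a href="https://example.com/x">X</a></article>
"""

# html.parser is used here only to avoid the extra lxml dependency.
soup = BeautifulSoup(html, "html.parser")

# A class_ string with a space is compared against the whole class attribute.
exact = soup.find_all("article", class_="mod mod-Treffer")

# A single class name matches any article that carries that class token.
by_token = soup.find_all("article", class_="mod-Treffer")

# The same idea expressed as a CSS selector.
by_css = soup.select("article.mod-Treffer")

# Look up the first link inside each article instead of relying on next_element.
links = []
for article in by_token:
    anchor = article.find("a", href=True)
    if anchor is not None:
        links.append(anchor["href"])

print(len(exact), len(by_token), len(by_css))
print(links)
```

Searching for the anchor inside each article is also less fragile than chaining next_element, which depends on the exact whitespace and nesting of the markup.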