I'm trying to scrape the links from http://www.betexplorer.com/soccer/england/premier-league-2016-2017/results/ and add them to an empty list.
Here is my code:
from bs4 import BeautifulSoup
import requests

l = []
r = requests.get("http://www.betexplorer.com/soccer/england/premier-league-2016-2017/results/")
c = r.content
soup = BeautifulSoup(c, "html.parser")
for link in soup.find_all("a", {"class": "in-match"}):
    href = link.get('href')
    l.append(href)
    print(l[0])
When I try to print the first link on the page, this is the output:
/soccer/england/premier-league-2016-2017/arsenal-everton/SGPa5fvr/
/soccer/england/premier-league-2016-2017/arsenal-everton/SGPa5fvr/
/soccer/england/premier-league-2016-2017/arsenal-everton/SGPa5fvr/
/soccer/england/premier-league-2016-2017/arsenal-everton/SGPa5fvr/
.................
The problem is that when I print one specific link, it is printed many times, but it should be printed only once.
You have made a simple logical error: your print statement is currently inside the loop, so it runs on every iteration and prints l[0] each time. Moving it out of the loop fixes the issue.
Fixed version:
for link in soup.find_all("a", {"class": "in-match"}):
    href = link.get('href')
    l.append(href)

print(l[0])
After the loop finishes, the list l will contain all the links, and print(l[0]) runs only once.
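
To see the difference without hitting the website, here is a minimal sketch that replaces the scraped links with a hard-coded list. The `hrefs` list and both function names are made up for illustration; only the first path comes from the question's output, and the other two are placeholders. Instead of calling print, each function records what would have been printed, so the two placements can be compared.

```python
# Stand-in for the href values found by soup.find_all("a", {"class": "in-match"}).
# Only the first path appears in the question; the other two are made-up placeholders.
hrefs = [
    "/soccer/england/premier-league-2016-2017/arsenal-everton/SGPa5fvr/",
    "/example/match-2/",
    "/example/match-3/",
]


def printed_inside_loop(hrefs):
    """Mimic the original code: record l[0] once per loop iteration."""
    l = []
    printed = []
    for href in hrefs:
        l.append(href)
        printed.append(l[0])  # stands in for print(l[0]) inside the loop
    return printed


def printed_after_loop(hrefs):
    """Mimic the fixed code: fill the list first, then record l[0] once."""
    l = []
    for href in hrefs:
        l.append(href)
    return [l[0]]  # stands in for a single print(l[0]) after the loop


inside = printed_inside_loop(hrefs)
outside = printed_after_loop(hrefs)
print(len(inside), len(outside))  # 3 1
```

With the print inside the loop, the first link is emitted once per matched anchor, which is exactly the repeated output shown in the question.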