I am trying to get the html of the following web: https://betway.es/es/sports/cpn/tennis/230 in order to get the matches' names and the odds with the code in python:
from bs4 import BeautifulSoup
import urllib.request
url = 'https://betway.es/es/sports/cpn/tennis/230'
page = urllib.request.urlopen(url)
soup = BeautifulSoup(page, 'html.parser')
soup = str(soup)
But when I run the code it throws the next exception: HTTPError: HTTP Error 403: Forbidden
I have seen that maybe with headers could be possible, but I am completely new with this module so no idea how to use them. Any advice? In addition, although I am able to download the url, I cannot find the odds, anyone knows what can be a reason?
I'm unfortunately part of a country blocked by this site.
But, using the requests package:
import requests as rq
from bs4 import BeautifulSoup as bs
url = 'https://betway.es/es/sports/cpn/tennis/230'
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:86.0) Gecko/20100101 Firefox/86.0"}
page = rq.get(url, headers=headers)
You can find your headers in F12 -> Networks -> random line -> Headers Tab
It is, as a result, a partial answer.