I am trying to scrape the odds for each game at https://www.oddschecker.com/us/football. I do not see an obvious API request in the XHR tab of Chrome DevTools. Am I missing something here? Where is this data coming from?
I know I could scrape the data by rendering the JavaScript with Splash or Selenium (I am using Scrapy and Python), but I am having major headaches with Splash that I can't seem to get any help with. I was hoping someone could show me a way to access the API so I could skip these tools for loading dynamic websites.
Any suggestions would be appreciated!
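If you view the page source, you will see that the data on that page is embedded as JSON in a script tag with the id initial-data, rather than fetched by a separate XHR request. You can parse it directly: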
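import json

import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0'}
# verify=False skips TLS certificate verification; remove it unless the request fails without it.
r = requests.get('https://www.oddschecker.com/us/football', verify=False, headers=headers)
soup = BeautifulSoup(r.text, 'lxml')
# The page embeds its data as JSON inside the script tag with id "initial-data".
data = json.loads(soup.find("script", {"id": "initial-data"}).get_text(strip=True))
# Save the parsed data so it can be inspected offline.
with open("data.json", "w") as f:
    json.dump(data, f)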
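Once you have the parsed JSON, the next step is finding where the odds actually sit inside it. I have not mapped out the structure of that blob, so here is a minimal sketch of a helper that prints the keys and value types of any nested JSON value. The name summarize and the sample dictionary are made up for illustration; the sample is a placeholder and is not the site's real layout.

```python
def summarize(node, path="$", max_depth=3, depth=0):
    """Return lines describing the keys and value types of a nested JSON value."""
    lines = []
    if isinstance(node, dict):
        lines.append(f"{path}: dict with {len(node)} keys")
        if depth < max_depth:
            for key, value in node.items():
                lines.extend(summarize(value, f"{path}.{key}", max_depth, depth + 1))
    elif isinstance(node, list):
        lines.append(f"{path}: list of {len(node)} items")
        # Look at the first element only; lists usually hold same-shaped records.
        if node and depth < max_depth:
            lines.extend(summarize(node[0], f"{path}[0]", max_depth, depth + 1))
    else:
        lines.append(f"{path}: {type(node).__name__}")
    return lines


# Placeholder data for illustration only, NOT the real oddschecker structure.
sample = {"events": [{"name": "A v B", "odds": [1.5, 2.5]}], "count": 1}

if __name__ == "__main__":
    print("\n".join(summarize(sample)))
```

To use it on the real page, pass the data object from the snippet above instead of sample, and raise max_depth until the fields holding the odds show up.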