python selenium web-scraping dynamic-html

Fastest way to get dynamically updated HTML content from website?


I want to scrape a stock website and get the prices using Selenium. I can't use plain HTTP requests because the HTML is generated dynamically. I am using a headless Selenium WebDriver to get the data, but each request takes around 30 seconds. Is there a faster way to get the dynamic HTML?


Solution

  • No, you are stuck with the time Selenium spends waiting for the page to render

    Dynamically rendered HTML needs a full browser to run the page's JavaScript, and there is little way around that. If the pages you need are separate and distinct, e.g. you are scraping both stocks.com/oilandgas and stocks.com/agriculture, there is one possible way to speed things up.

    The one option you might have is to create a separate thread for each Selenium WebDriver instance, so that the two pages are scraped at the same time by two different drivers.

    The caveat is that this only speeds things up if the bottleneck (whatever is causing the slowness) is the rendering of the page.

    If the bottleneck is instead your internet connection, your computer's processing power, or the website's server speed, running drivers in parallel would not improve things.

    That said, Daniel Farrell (below) suggests that running the requests in parallel would also help when networking is the slow part, so you may want to give it a shot.
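
The one-thread-per-driver idea above could be sketched roughly as follows. This is a minimal illustration, not a tuned implementation: it assumes Selenium 4 with Chrome and a matching driver available on the machine, and the helper names fetch_rendered_html and scrape_all are hypothetical. The two URLs are the illustrative ones from the answer, not real endpoints.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_rendered_html(url):
    """Load one page in its own headless Chrome instance and return the rendered HTML.

    Hypothetical helper: assumes Chrome and a compatible driver are installed.
    """
    # Imported lazily so the threading logic can be tried without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always release the browser process


def scrape_all(urls, fetch=fetch_rendered_html, max_workers=2):
    """Fetch every URL concurrently, each call to fetch running in a worker thread."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input URLs.
        return dict(zip(urls, pool.map(fetch, urls)))


if __name__ == "__main__":
    pages = scrape_all([
        "https://stocks.com/oilandgas",    # illustrative URLs from the answer
        "https://stocks.com/agriculture",
    ])
    for url, html in pages.items():
        print(url, len(html))
```

Each call creates and quits its own driver, so no WebDriver object is shared between threads. Whether this actually cuts the total time depends on where the 30 seconds go, as the caveat above notes, so it is worth timing one page alone against two pages in parallel before committing to it.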