web-scraping server-side

Is there any way to get the HTML of a lazy loading site with a server-side programming language?


I need to get the DOM tree of a site that implements lazy loading, i.e. the content is fetched with an AJAX call as soon as you scroll past a certain point in the browser. (To clarify: I need the DOM tree after the lazy-loading function has inserted the content.)

I don't care if the solution is somewhat messy or unstable, since this is for a private project. I also don't care about the technology involved, except that it has to be server-side and available on Linux servers without a graphical UI.

Any thoughts are welcome.


Solution

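  • I'd suggest PhantomJS, a headless browser you script with JavaScript. Simple scraping (cURL, wget, etc.) isn't going to be enough, because those tools only download the initial HTML and never run the page's JavaScript, so the lazily loaded content is never inserted.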
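
To illustrate the headless-browser approach, here is a minimal sketch in Python. Note that it is not PhantomJS: PhantomJS development was suspended in 2018, so the sketch swaps in headless Firefox driven through Selenium, which also runs on a Linux server without a GUI. It assumes the `selenium` package, Firefox and geckodriver are installed. The function names, the one-second pause and the "stop when the page height stops growing" rule are illustrative assumptions, not something the question or answer specifies, and a site that loads content on a timer or a button click would need a different trigger.

```python
import time


def scroll_until_stable(driver, pause=1.0, max_rounds=20):
    """Scroll to the bottom repeatedly until the page height stops growing.

    `driver` is any object with an `execute_script(js)` method, such as a
    Selenium WebDriver. Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for rounds in range(1, max_rounds + 1):
        # Jump to the bottom so the page's scroll handler fires its AJAX call.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
        time.sleep(pause)  # crude fixed wait for the response to be inserted
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            return rounds  # nothing new was appended, assume loading is done
        last_height = new_height
    return max_rounds  # gave up: the page kept growing (e.g. infinite feed)


def fetch_rendered_html(url, pause=1.0, max_rounds=20):
    """Return the serialized DOM of `url` after scrolling to trigger loading."""
    # Imported here so scroll_until_stable works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.add_argument("-headless")  # run Firefox without a display
    driver = webdriver.Firefox(options=options)
    try:
        driver.get(url)
        scroll_until_stable(driver, pause, max_rounds)
        return driver.page_source  # the DOM as it is now, not the raw response
    finally:
        driver.quit()  # always shut the browser process down
```

Calling `fetch_rendered_html("https://example.com/")` (a placeholder URL) should return the page's HTML after the scroll-triggered requests have had a chance to finish. The fixed pause is the messy part: too short and content is missed, too long and the scrape is slow, which seems acceptable for a private project.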