Scraping HTML dynamically generated by JavaScript with Python and Selenium


I currently have a problem with dynamically generated HTML on this site:

http://www.economia-sniim.gob.mx/Nuevo/Home.aspx?opcion=Consultas/MercadosNacionales/PreciosDeMercado/Agricolas/ConsultaFrutasYHortalizas.aspx?SubOpcion=4|0

I would like to select the "Origen" and "Date" options on the website, but the page source I get does not contain all of the HTML.

Could someone give me a hint on how to scrape all of the dynamically generated HTML?

Thanks,


Solution

  • The advantage of Selenium is that it starts a real browser session from your program, so the page's JavaScript actually runs and you can trigger events yourself (in this case, a scroll) before reading the HTML.

    from bs4 import BeautifulSoup
    from selenium import webdriver

    driver = webdriver.Firefox()
    driver.get('http://cavemendev.com')

    # Trigger scroll-driven JavaScript before reading the DOM.
    # Scrolling to the bottom is one choice; use any Y offset you need.
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")

    # page_source reflects the DOM after the scripts have run.
    html = driver.page_source
    soup = BeautifulSoup(html, 'html.parser')

    for tag in soup.find_all('title'):
        print(tag.text)

    driver.quit()
    

    Let me know if this doesn't make sense.
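
The same idea can be extended to the dropdowns from the question. This is only a rough sketch under assumptions: the element id `ddlOrigen` and the label `Todos` used in the usage note are hypothetical placeholders (the real ones have to be read from the rendered page), and the helper names `list_options` and `choose_and_get_html` are made up for illustration. It waits until JavaScript has inserted the dropdown, picks an option by its visible text, and returns the rendered HTML; a small standard-library parser then lists the options found in that HTML.

```python
from html.parser import HTMLParser


class OptionLister(HTMLParser):
    """Collect [value, label] pairs for the <option>s of one <select>."""

    def __init__(self, select_id):
        super().__init__()
        self.select_id = select_id
        self.options = []
        self._in_select = False
        self._in_option = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select":
            # Only track the dropdown whose id matches.
            self._in_select = attrs.get("id") == self.select_id
        elif tag == "option" and self._in_select:
            self.options.append([attrs.get("value", ""), ""])
            self._in_option = True

    def handle_endtag(self, tag):
        if tag == "select":
            self._in_select = False
        if tag in ("option", "select"):
            self._in_option = False

    def handle_data(self, data):
        if self._in_option:
            self.options[-1][1] += data


def list_options(html, select_id):
    """Return [(value, label), ...] for the <select> with the given id."""
    parser = OptionLister(select_id)
    parser.feed(html)
    return [(value, label.strip()) for value, label in parser.options]


def choose_and_get_html(url, select_id, label, timeout=10):
    """Load url, pick `label` in the dropdown `select_id`, return the HTML."""
    # Imported here so list_options() stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import Select, WebDriverWait

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        # Block until JavaScript has inserted the dropdown into the DOM.
        dropdown = WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.ID, select_id))
        )
        Select(dropdown).select_by_visible_text(label)
        return driver.page_source
    finally:
        driver.quit()
```

A call might look like `html = choose_and_get_html(url, "ddlOrigen", "Todos")` followed by `list_options(html, "ddlOrigen")`, with both placeholder strings replaced by whatever the real page uses. If the dropdowns turn out to sit inside a frame, `driver.switch_to.frame(...)` would be needed before the wait.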