python html selenium beautifulsoup html-parsing

How can I get the code from the "View source" on the site page using BS4 or another library?


When browsing a site, we have two options: "View source" and "View page source". BS4 can get the data shown by "View page source". Is it possible to get the data shown by "View source" as well? If not, is there another way to get it? I would really appreciate your help!


Solution


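    import time

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    # Run Chrome without opening a visible window.
    chrome_options = Options()
    chrome_options.add_argument("--headless")
    driver = webdriver.Chrome(options=chrome_options)

    try:
        driver.get("my_URL")  # placeholder: replace with the target page URL

        # Give the page's JavaScript time to finish loading the data.
        time.sleep(10)

        # HTML of the page after the scripts have run.
        html_source = driver.page_source
    finally:
        driver.quit()  # close the browser even if loading fails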
    

    The headless option launches the browser without displaying a window. The pause gives the page's JavaScript time to finish running; without it, the data we need may not have loaded yet. As a result, driver.page_source returns data that matches what "View source" shows.
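
A fixed 10-second pause can be longer than needed on fast pages and too short on slow ones. One possible alternative, sketched below, is to poll the page source until two consecutive snapshots are identical. This is only a heuristic and not part of the original answer: the function name `rendered_html` and the `timeout` and `poll` parameters are hypothetical, and the only things it assumes about the driver are the `get()` method and `page_source` attribute used above.

```python
import time


def rendered_html(driver, url, timeout=10.0, poll=0.5):
    """Load url and return page_source once it stops changing.

    Heuristic: the page is treated as "done" when two consecutive
    snapshots taken `poll` seconds apart are identical. If that never
    happens within `timeout` seconds, the latest snapshot is returned.
    """
    driver.get(url)
    deadline = time.monotonic() + timeout
    previous = driver.page_source
    while time.monotonic() < deadline:
        time.sleep(poll)
        current = driver.page_source
        if current == previous:
            break  # no change since the last poll: assume scripts are done
        previous = current
    return previous
```

With the driver from the answer, `html_source = rendered_html(driver, "my_URL")` would take the place of the `driver.get(...)` call and the fixed `time.sleep(10)`, and the result could then be passed to `BeautifulSoup(html_source, "html.parser")`. Pages that keep updating themselves (clocks, ads, live feeds) may never look stable, so the function falls back to the timeout. When a specific element is known to appear once the data has loaded, Selenium's explicit waits (`WebDriverWait`) are usually a more precise tool.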