Clicking two consecutive buttons while scraping a website with Selenium in Python


I am trying to scrape the country information from https://www.morningstar.com/etfs/xnas/vnqi/portfolio. This entails clicking the 'Country' selection in the Exposure section, then moving through pages 1, 2, 3, etc. using the arrows at the bottom of the section. Nothing I have tried seems to work. Is there a way to do it using Selenium in Python?

Many thanks!

Here is the code I used:

    urlpage   = 'https://www.morningstar.com/etfs/xnas/vnqi/portfolio'
    driver = webdriver.Chrome(options=options, executable_path='D:\Python\Python38\chromedriver.exe')
    driver.get(urlpage)
    elements=WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//a[text()='Country']")))
    for elem in elements:
        elem.click()

and this is the error message:

    TimeoutException                          Traceback (most recent call last)
    <ipython-input-3-bf16ea3f65c0> in <module>
         23 driver = webdriver.Chrome(options=options, executable_path='D:\Python\Python38\chromedriver.exe')
         24 driver.get(urlpage)
    ---> 25 elements=WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//a[text()='Country']")))
         26 for elem in elements:
         27      elem.click()

    D:\Anaconda\lib\site-packages\selenium\webdriver\support\wait.py in until(self, method, message)
         78             if time.time() > end_time:
         79                 break
    ---> 80         raise TimeoutException(message, screen, stacktrace)
         81 
         82     def until_not(self, method, message=''):

    TimeoutException: Message: 

Sorry, not sure how to format the error message better. Thanks again.


Solution

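  • The XPath never matches because the page does not contain what it looks for, so the first step is to inspect the actual HTML (for example in the browser's DevTools) before writing a locator.

    There is NO <a> with the text Country on this page.

    There is an <input> with value="Country" instead, so WebDriverWait finds nothing, times out after 10 seconds and raises TimeoutException.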
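
To illustrate the difference between the two locators without a browser, here is a minimal sketch using only the standard library. The markup in `SAMPLE` is hypothetical and heavily simplified (the real page's HTML will differ); it only assumes what is stated above, namely that 'Country' is an <input> value and not link text.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page's HTML will differ.
SAMPLE = """
<div>
  <label>
    <input type="radio" name="exposure" value="Region"/>
    <span>Region</span>
  </label>
  <label>
    <input type="radio" name="exposure" value="Country"/>
    <span>Country</span>
  </label>
</div>
"""

root = ET.fromstring(SAMPLE)

# Rough equivalent of //a[text()='Country']: there are no <a> elements at all.
links = [a for a in root.iter("a") if (a.text or "").strip() == "Country"]

# Rough equivalent of //input[@value="Country"]: exactly one match.
inputs = root.findall(".//input[@value='Country']")

print(len(links), len(inputs))
```

With this sample the first locator matches nothing and the second matches one element, which is the same situation that makes the wait in the question time out.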


    This code works for me

    import time
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    
    url = 'https://www.morningstar.com/etfs/xnas/vnqi/portfolio'
    
    driver = webdriver.Chrome()
    driver.get(url)
    
    time.sleep(2)  # crude wait for the page's JavaScript to render the section
    
    # the 'Country' selector is an <input>, not an <a>
    country = driver.find_element(By.XPATH, '//input[@value="Country"]')
    country.click()
    
    time.sleep(1)
    next_page = driver.find_element(By.XPATH, '//a[@aria-label="Go to Next Page"]')
        
    while True:
        
        # get data
        table_rows = driver.find_elements(By.XPATH, '//table[@class="sal-country-exposure__country-table"]//tr')
        for row in table_rows[1:]:  # skip header 
            elements = row.find_elements(By.XPATH, './/span')  # relative xpath with `.//`
            print(elements[0].text, elements[1].text, elements[2].text)
    
        # check if there is next page
        # get_attribute() returns a string (or None if the attribute is absent),
        # so the string 'false' must not be treated as "disabled"
        disabled = next_page.get_attribute('aria-disabled')
        if disabled and disabled != 'false':
            break
    
        # go to next page        
        next_page.click()
        
        time.sleep(1)
    

    Result

    Japan 22.08 13.47
    China 10.76 1.45
    Australia 9.75 6.05
    Hong Kong 9.52 6.04
    Germany 8.84 5.77
    Singapore 6.46 4.33
    United Kingdom 6.22 5.77
    Sweden 3.48 2.00
    France 3.18 2.58
    Canada 2.28 2.92
    Switzerland 1.78 0.69
    Belgium 1.63 1.31
    Philippines 1.53 0.15
    Israel 1.47 0.16
    Thailand 0.98 0.09
    India 0.87 0.11
    South Africa 0.87 0.21
    Taiwan 0.83 0.08
    Mexico 0.80 0.33
    Spain 0.62 0.84
    Malaysia 0.54 0.08
    Brazil 0.52 0.06
    Austria 0.51 0.16
    New Zealand 0.41 0.21
    Indonesia 0.37 0.02
    Norway 0.37 0.29
    United States 0.29 44.09
    Netherlands 0.24 0.19
    Chile 0.21 0.01
    Ireland 0.16 0.19
    South Korea 0.15 0.00
    Turkey 0.08 0.02
    Russia 0.08 0.00
    Finland 0.06 0.16
    Poland 0.05 0.00
    Greece 0.05 0.00
    Italy 0.02 0.05
    Argentina 0.00 0.00
    Colombia 0.00 0.00
    Czech Republic 0.00 0.00
    Denmark 0.00 0.00
    Estonia 0.00 0.00
    Hungary 0.00 0.00
    Latvia 0.00 0.00
    Lithuania 0.00 0.00
    Pakistan 0.00 0.00
    Peru 0.00 0.00
    Portugal 0.00 0.00
    Slovakia 0.00 0.00
    Venezuela 0.00 0.00
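
The pagination loop above stops based on the button's aria-disabled attribute. Selenium's get_attribute() returns such a value as a string (or None when the attribute is absent), so a plain truthiness test would also treat the string 'false' as "disabled". A small helper can make the check explicit. Note that `is_aria_disabled` is a hypothetical name used here for illustration, not part of Selenium, and this is only a sketch assuming the page uses the usual "true"/"false" values:

```python
def is_aria_disabled(value):
    """Interpret the raw aria-disabled attribute value (a string or None).

    None, an empty string and 'false' mean the control is still enabled;
    any other value (normally 'true') is treated as disabled.
    """
    if value is None:
        return False
    return value.strip().lower() not in ("", "false")


# Example values as they might come back from get_attribute('aria-disabled')
for raw in ("true", "false", None):
    print(repr(raw), is_aria_disabled(raw))
```

In the loop, the check would then read `if is_aria_disabled(next_page.get_attribute('aria-disabled')): break`.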