python selenium xpath css-selectors webdriverwait

Scraping SVG data by CSS selector and ID (Selenium)


I'm looking to scrape a label from an SVG that only appears on mouse hover. I'm working with this link, and the data I need sits behind the [+] expand button at the right of each table row. When you press [+] expand, an SVG chart pops up, made up of nested elements that draw the bars. When you hover over one of them, an element labeled "Capacity Impact" appears with the value for that bar. These values are what I want to scrape.

See a screenshot below.

[Screenshot of the expanded SVG chart with the hover label]

So far, my code successfully opens each of the [+] expand buttons and finds the polygons, but I can't reach the labels with either XPath or CSS selectors. See the code below.


from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains

driver = webdriver.Chrome()  # assumption: any WebDriver would do here
action = ActionChains(driver)

driver.get(url)  # url is the page linked above
table_button_xpath = "//table[@class='data-view-table redispatching dataTable']//tr//td[@class = 'button-column']//a[@class='openIcon pre-table-button operation-detail-expand small-button ui-button-light ui-button ui-widget ui-corner-all ui-button-text-only']"

# dismiss the pop-up and the cookie banner
driver.find_element(By.ID, "close-button").click()
driver.find_element(By.ID, "cookieconsent-button").click()

# open up the "+" buttons: re-locate the first matching button and click it, nine times
for _ in range(9):
    driver.find_element(By.XPATH, table_button_xpath).click()

# find all the polygons
polygons = driver.find_elements(By.TAG_NAME, 'path')

label_xpath = "//*[name()='svg']//*[name()='g' and @id = 'ballons']//*[name()='g']//*[name()='tspan']"

for polygon in polygons:
    # hover over the polygon so that its label is rendered
    action.move_to_element(polygon).perform()

    # look for the label both ways
    labels_by_xpath = driver.find_elements(By.XPATH, label_xpath)
    labels_by_css_selector = driver.find_elements(By.CSS_SELECTOR, "svg>#ballons>g>text>tspan")

Both labels_by_xpath and labels_by_css_selector come back as empty lists. I've tried many variations of both the XPath and the CSS selector, with and without WebDriverWait, but I can't get either to return the capacity impact values.

A screenshot of the HTML is also copied below (to be clear, the number I need to scrape is the "50" text in the tspan tag).

[Screenshot of the HTML around the label]

Any help is appreciated! Thank you, Sophie


Solution

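  • The problem is with the locator. The selectors in the question look for an id of exactly ballons, while the selector below matches any id that starts with balloons and then narrows down to the second text element in the second group. Here is the updated locator to select the desired element.

    CSS Selector :

    svg>[id^='balloons']>g:nth-child(2)>text:nth-child(2)>tspan

    Try this to get the tspan element whose text is the capacity value, 50.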
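
As a rough sketch of how the selector above might be combined with the hover loop from the question: the helper names `collect_hover_labels` and `scrape_with_selenium` are hypothetical, the plain polling loop is a stand-in for WebDriverWait (chosen so the helper can be tried without a browser), and the page structure is taken on trust from the answer rather than verified. Like the question's code, it hovers over every `path` element, which may include shapes that are not bars.

```python
import time


def collect_hover_labels(targets, hover, read_labels, timeout=2.0, poll=0.1):
    """Hover each target, then poll read_labels() until it returns
    a non-empty list or the timeout expires. Returns one list per target."""
    results = []
    for target in targets:
        hover(target)
        deadline = time.monotonic() + timeout
        texts = read_labels()
        # The label is rendered only after the hover, so retry for a while.
        while not texts and time.monotonic() < deadline:
            time.sleep(poll)
            texts = read_labels()
        results.append(texts)
    return results


def scrape_with_selenium(driver):
    """Wire the helper to Selenium. Assumes the driver already shows the
    page with the [+] rows expanded, as in the question's code."""
    # Imported here so the helper above can be exercised without Selenium.
    from selenium.webdriver.common.action_chains import ActionChains
    from selenium.webdriver.common.by import By

    # Selector from the answer above; not verified against the live page.
    label_css = "svg>[id^='balloons']>g:nth-child(2)>text:nth-child(2)>tspan"
    bars = driver.find_elements(By.TAG_NAME, "path")

    def hover(bar):
        # perform() is what actually sends the queued mouse move.
        ActionChains(driver).move_to_element(bar).perform()

    def read_labels():
        found = driver.find_elements(By.CSS_SELECTOR, label_css)
        return [element.text for element in found]

    return collect_hover_labels(bars, hover, read_labels)
```

Calling `scrape_with_selenium(driver)` after the expand-button loop would return one list of label texts per hovered element; an empty list means no label showed up for that element within the timeout.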