python selenium xpath css-selectors webdriverwait

Using Python and Selenium to scrape the innerText of an HTML element

I wrote a script that uses the selenium and pyautogui modules to log in, scrape a value from an element, and print it, but it prints two dashes (--) instead of the value.

Here is the HTML that contains the value 417 which I want to retrieve:

<p id="totReqCountVal" class="trailer-0 avenir-regular font-size-4 text-green js-total-requests">417</p>

This is the relevant code I have tried:

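from selenium import webdriver
from selenium.webdriver.common.by import By

# Assumption: Chrome is used; the original snippet did not show how browser was created.
browser = webdriver.Chrome()
browser.get('website_to_be_scraped')
browser.find_element(By.ID, 'totReqCountVal')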

I then tried:

views = browser.find_element(By.ID, 'totReqCountVal')
print(views)

which returns:

(session="12e48df447f7df855a1ee596ba609a30", element="1027ec31-8cb8-4758-b4b0-82b85628ed6c")

With some help I have also tried the following:

Using CSS_SELECTOR and text attribute:

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "p#totReqCountVal[class$='js-total-requests']"))).text)

Using XPATH and get_attribute("innerHTML"):

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//p[@id='totReqCountVal' and contains(@class, 'js-total-requests')]"))).get_attribute("innerHTML"))

I also added the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

I have checked in DevTools that the locator strategies identify the element uniquely, and I have also checked for iframes and shadow roots.

How do I retrieve the 417 value?

Solution

views is the WebElement itself, not its text, so printing it correctly shows the element's session and element IDs:

(session="12e48df447f7df855a1ee596ba609a30", element="1027ec31-8cb8-4758-b4b0-82b85628ed6c")
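
To get the value instead of the element's representation, read the element's text attribute. The following is a minimal sketch: element_text is a hypothetical helper name, and the fallback to the textContent property is an addition that is not part of the original answer. It is meant for the case where .text comes back empty because Selenium does not consider the element visible.

```python
def element_text(element):
    """Return the text of a Selenium WebElement.

    .text only returns text that is rendered as visible, so fall back to the
    textContent property when .text is empty (assumption: the element may be
    hidden or not yet painted).
    """
    text = element.text
    if not text:
        text = element.get_attribute("textContent") or ""
    return text.strip()


# Usage with the question's variables (requires a live browser session):
#   views = browser.find_element(By.ID, 'totReqCountVal')
#   print(views)                # prints the WebElement representation
#   print(element_text(views))  # prints the element's text
```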

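To print the text 417, use WebDriverWait with visibility_of_element_located() so that the element is visible before you read it, then apply either of the following locator strategies (driver is the WebDriver instance, which is named browser in the question's script):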

Using CSS_SELECTOR and text attribute:

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "p#totReqCountVal[class$='js-total-requests']"))).text)

Using XPATH and get_attribute("innerHTML"):

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//p[@id='totReqCountVal' and contains(@class, 'js-total-requests')]"))).get_attribute("innerHTML"))

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
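
If the element is already visible while it still shows --, visibility_of_element_located() returns immediately and the text read is still --. The sketch below waits on the text itself. It assumes that -- is a placeholder the page replaces with the real count through JavaScript after loading, which the question does not confirm. text_is_loaded and read_total_requests are hypothetical names introduced here for illustration.

```python
def text_is_loaded(locator, placeholder="--"):
    """Build a wait condition for WebDriverWait.until().

    The condition returns the element's text once it is non-empty and differs
    from the placeholder; otherwise it returns False so the wait keeps polling.
    """
    def condition(driver):
        text = driver.find_element(*locator).text.strip()
        return text if text and text != placeholder else False
    return condition


def read_total_requests(driver, timeout=20):
    # Selenium is imported here so the condition above can be used and
    # checked without a browser installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    # until() returns the first truthy value produced by the condition,
    # which here is the loaded text.
    return WebDriverWait(driver, timeout).until(
        text_is_loaded((By.ID, "totReqCountVal"))
    )


# Usage with the question's variable (requires a live browser session):
#   print(read_total_requests(browser))
```

Returning the text from the condition, rather than True, lets until() hand back the value directly, so no second lookup is needed after the wait.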

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python

References

Links to useful documentation:

get_attribute() method: gets the given attribute or property of the element.
text attribute: returns the text of the element.
Difference between text and innerHTML using Selenium