python selenium selenium-webdriver youtube webdriver

Get YouTube video title using class name and text attribute with Selenium and Python


Hi, I'm using Python with Selenium WebDriver to get a YouTube video's title, but I keep getting more text than I'd like. The line is: driver.find_element_by_class_name("style-scope ytd-video-primary-info-renderer").text

Is there a way to fix it, and make it more efficient, so that it displays only the title? Here is the test script I'm using:

from selenium import webdriver as wd
from time import sleep as zz

driver = wd.Firefox(executable_path=r'./geckodriver.exe')
driver.get('https://www.youtube.com/watch?v=wma0szfIafk')
zz(4)
test_atr = driver.find_element_by_class_name("style-scope ytd-video-primary-info-renderer").text
print(test_atr)

Solution

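  • To print only the title text, OBI-WAN KENOBI Official Trailer (2022) Teaser, target the yt-formatted-string element inside the h1 title instead of the whole ytd-video-primary-info-renderer. You can use either of the following locator strategies: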

    • Using css_selector and get_attribute("innerHTML"):

      print(driver.find_element(By.CSS_SELECTOR, "h1.title.style-scope.ytd-video-primary-info-renderer > yt-formatted-string.style-scope.ytd-video-primary-info-renderer").get_attribute("innerHTML"))
      
    • Using xpath and text attribute:

      print(driver.find_element(By.XPATH, "//h1[@class='title style-scope ytd-video-primary-info-renderer']/yt-formatted-string[@class='style-scope ytd-video-primary-info-renderer']").text)
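Both locators above name a specific tag and list every class on it, whereas the original attempt passed a space-separated class list to a class-name locator, which is meant for a single class name. As a rough sketch of the conversion, the helper below (css_for is a hypothetical name, not part of Selenium) turns a tag name and a class attribute value into the dotted CSS form used in the first bullet:

```python
def css_for(tag: str, class_attr: str) -> str:
    """Build 'tag.class1.class2' from a tag name and a class attribute value."""
    # split() with no arguments also collapses repeated whitespace
    classes = class_attr.split()
    return tag + "".join(f".{name}" for name in classes)


# Child combinator: the <yt-formatted-string> directly inside the title <h1>
title_css = " > ".join([
    css_for("h1", "title style-scope ytd-video-primary-info-renderer"),
    css_for("yt-formatted-string", "style-scope ytd-video-primary-info-renderer"),
])
print(title_css)
```

The printed selector is the same string passed to By.CSS_SELECTOR above, so the helper is only a convenience for building such selectors from a class attribute copied out of the page source.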
      

    Ideally, instead of a fixed sleep, use WebDriverWait with visibility_of_element_located() so the script waits until the title is actually visible. You can use either of the following locator strategies:

    • Using CSS_SELECTOR and text attribute:

      print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "h1.title.style-scope.ytd-video-primary-info-renderer > yt-formatted-string.style-scope.ytd-video-primary-info-renderer"))).text)
      
    • Using XPATH and get_attribute("innerHTML"):

      print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//h1[@class='title style-scope ytd-video-primary-info-renderer']/yt-formatted-string[@class='style-scope ytd-video-primary-info-renderer']"))).get_attribute("innerHTML"))
      
    • Note : You have to add the following imports :

      from selenium.webdriver.support.ui import WebDriverWait
      from selenium.webdriver.common.by import By
      from selenium.webdriver.support import expected_conditions as EC
      
    • Console Output:

      OBI-WAN KENOBI Official Trailer (2022) Teaser
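Putting the pieces together, here is a minimal sketch of a complete script that uses the explicit wait in place of the fixed sleep. It assumes Selenium 4 with a Firefox driver that Selenium can locate on its own, and that YouTube still serves the markup these selectors target; get_video_title, main and TITLE_CSS are illustrative names, not part of Selenium.

```python
TITLE_CSS = (
    "h1.title.style-scope.ytd-video-primary-info-renderer"
    " > yt-formatted-string.style-scope.ytd-video-primary-info-renderer"
)


def get_video_title(driver, timeout=20):
    """Wait until the title element is visible, then return its text."""
    # Imported inside the function so the helper can be defined
    # even before Selenium is loaded
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    element = WebDriverWait(driver, timeout).until(
        EC.visibility_of_element_located((By.CSS_SELECTOR, TITLE_CSS))
    )
    return element.text


def main():
    from selenium import webdriver

    # Assumption: no executable_path is needed because the driver is discoverable
    driver = webdriver.Firefox()
    try:
        driver.get("https://www.youtube.com/watch?v=wma0szfIafk")
        print(get_video_title(driver))
    finally:
        driver.quit()  # always close the browser, even if the wait times out
```

Calling main() opens the page and prints the title. If the element never becomes visible within the timeout, the wait raises a TimeoutException rather than returning partial text.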
      

    You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python

