I'm trying to scrape the whole table from this website: https://qmjhldraft.rinknet.com/results.htm?year=2018
When the column is a plain td (the names, for example), I can select it with a simple XPath like this:
players = driver.find_elements_by_xpath('//tr[@rnid]/td[4]')
And I can scrape the players' names using this code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# Raw string so the backslashes in the Windows path are not treated as escapes
PATH = r'C:\Program Files (x86)\chromedriver.exe'
driver = webdriver.Chrome(PATH)
driver.get('https://qmjhldraft.rinknet.com/results.htm?year=2018')
try:
    elements = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, "//tr[@rnid]/td[1]"))
    )
finally:
    players = driver.find_elements_by_xpath('//tr[@rnid]/td[4]')

for player in players[:5]:
    pl = player.text
    print(pl)
But when I get to the "Height" column, I can't find the right XPath. I guess this has to do with the td having a class, "ht-itemVisibility1", which changes how it has to be selected. I've tried a few different XPaths, such as:
('//tr/td[@class="ht-itemVisibility1"][1]')
('//tr/td[@class="ht-itemVisibility1"][5]')
('//tr[@rnid]/td[5]')
to no avail. Can someone enlighten me on how to capture this XPath when the td has a class? Thanks a lot.
Try this:
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get('https://qmjhldraft.rinknet.com/results.htm?year=2018')
try:
    elements = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, "//tr[@rnid]/td[1]"))
    )
finally:
    players = driver.find_elements_by_xpath('//tr[@rnid]/td[4]')

for player in players[:5]:
    pl = player.text
    print(pl)

players_height = driver.find_elements_by_xpath('//tr/td[@class="ht-itemVisibility1"][1]')
for player in players_height[:5]:
    pl = player.text
    print(pl)

players_last_team = driver.find_elements_by_xpath('//tr/td[@class="ht-itemVisibility1"][5]')
for player in players_last_team[:5]:
    pl = player.text
    print(pl)
I don't know why it wasn't working for you, but it works fine for me.
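One possible explanation, which is only a guess and not something confirmed in this thread: Selenium's `.text` returns only the text that is actually rendered, so if the `ht-itemVisibility1` columns happen to be hidden at your browser window size, `.text` would come back empty even though the XPath matches. The sketch below reads `textContent` instead, and it walks the table row by row with relative XPaths so the name, height and last team of one player stay together. The helper names (`cell_xpath`, `cell_text`, `scrape_rows`) and the `COLUMNS` mapping are made up for this example; the XPaths themselves are the ones used above.

```python
def cell_xpath(css_class=None, position=1):
    """Build an XPath, relative to a <tr>, for one of its <td> cells.

    Without css_class: the position-th <td> of the row.
    With css_class: the position-th <td> among those whose class
    attribute is exactly css_class.
    """
    if css_class is None:
        return f"./td[{position}]"
    return f'./td[@class="{css_class}"][{position}]'


def cell_text(cell):
    """Return the cell's text, including text of cells hidden by CSS.

    .text is empty for elements that are not displayed, whereas the
    textContent property is filled in either way.
    """
    return (cell.get_attribute("textContent") or "").strip()


# Hypothetical column mapping, built from the XPaths used in this thread.
COLUMNS = {
    "name": cell_xpath(position=4),
    "height": cell_xpath("ht-itemVisibility1", 1),
    "last_team": cell_xpath("ht-itemVisibility1", 5),
}


def scrape_rows(driver, columns, limit=None):
    """Return one dict per player row, keyed by the names in columns."""
    # Imported here so the helpers above can be used without Selenium.
    from selenium.webdriver.common.by import By

    records = []
    rows = driver.find_elements(By.XPATH, "//tr[@rnid]")
    for row in rows[:limit]:  # limit=None keeps every row
        record = {}
        for name, xpath in columns.items():
            # A leading "./" makes the XPath relative to this row.
            cells = row.find_elements(By.XPATH, xpath)
            record[name] = cell_text(cells[0]) if cells else None
        records.append(record)
    return records
```

With the `driver` from the code above already on the results page and the wait finished, `scrape_rows(driver, COLUMNS, limit=5)` should give a list of five dicts. A value of `None` means the XPath matched nothing in that row, while an empty string means the cell exists but has no text.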