I'm trying to scrape the top 20 holders of an ERC-20 token on Ethereum, using Selenium. It seems the elements behind my XPaths either don't load or don't get enough time to load.
I'm trying to load this page: https://etherscan.io/token/0xdac17f958d2ee523a2206206994597c13d831ec7#balances
I tried both an implicit wait and an explicit wait. When I run the webdriver I can even see that the page has loaded, but the XPath is never found...
Code with explicit wait:
options = Options()
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--no-sandbox")
driver = webdriver.Chrome(chrome_options=options)
driver.get("https://etherscan.io/token/0xdac17f958d2ee523a2206206994597c13d831ec7#balances")
wait = WebDriverWait(driver, 10, poll_frequency=1)
wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="maintable"]/div[3]/table/tbody/')))
Error:
selenium.common.exceptions.TimeoutException: Message:
Yep, not even a message...
Code with implicit wait:
options = Options()
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--no-sandbox")
driver = webdriver.Chrome(chrome_options=options)
driver.implicitly_wait(10)
driver.get("https://etherscan.io/token/0xdac17f958d2ee523a2206206994597c13d831ec7#balances")
for i in range(1, 21):
    req = driver.find_element_by_xpath('//*[@id="maintable"]/div[3]/table/tbody/tr['+str(i)+']/td[2]/span/a')
Error:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="maintable"]/div[3]/table/tbody/tr[1]/td[2]/span/a"}
So, as I said, it looks like the driver doesn't have enough time to load the page, but even with 20, 30, ... seconds it still doesn't find the path.
Also, when I copy the XPath from the browser window opened by the script, I can find the element with it.
The table is inside an iframe, so you need to switch to the iframe first to access the table.
Induce WebDriverWait() and wait for frame_to_be_available_and_switch_to_it()
Induce WebDriverWait() and wait for visibility_of_all_elements_located()
code:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get("https://etherscan.io/token/0xdac17f958d2ee523a2206206994597c13d831ec7#balances")
# The holders table lives inside an iframe, so switch into it before locating rows.
WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.ID, "tokeholdersiframe")))
# Wait until the holder links in the second column are visible.
elements = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, '//*[@id="maintable"]/div[3]/table/tbody//tr/td[2]//a')))
for ele in elements:
    print(ele.get_attribute('href'))
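If you don't know up front which iframe holds the table, one way to check is to step into each top-level iframe and test the XPath there. The following is only a rough sketch: `find_frames_containing` is a made-up helper name, it looks at top-level iframes only (not nested ones), and it passes the locator strategies as the plain strings "tag name" and "xpath", which are the values of `By.TAG_NAME` and `By.XPATH`, so that the helper itself needs no imports.

```python
def find_frames_containing(driver, xpath):
    """Return the id (or the index, if the id is empty) of every
    top-level iframe in which `xpath` matches at least one element.

    Hypothetical helper for debugging; `driver` is a Selenium WebDriver.
    """
    matches = []
    # Start from the top-level document so the iframe list is complete.
    driver.switch_to.default_content()
    # "tag name" is the value of By.TAG_NAME.
    frames = driver.find_elements("tag name", "iframe")
    for index, frame in enumerate(frames):
        # Read the id while we are still in the parent document.
        label = frame.get_attribute("id") or index
        driver.switch_to.frame(frame)
        try:
            # "xpath" is the value of By.XPATH; find_elements returns
            # an empty list instead of raising when nothing matches.
            if driver.find_elements("xpath", xpath):
                matches.append(label)
        finally:
            # Always return to the top-level document before the next frame.
            driver.switch_to.default_content()
    return matches
```

Calling `find_frames_containing(driver, '//*[@id="maintable"]')` after the page has loaded should then list the frame to pass to `frame_to_be_available_and_switch_to_it`. Note that the helper leaves the driver in the top-level document, so you still have to switch into the frame afterwards.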
If you want to fetch only the first 20 holders, use this:
for ele in elements[:20]:
    print(ele.get_attribute('href'))
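The loop prints full links rather than bare addresses. If you only want the holder addresses, something like the sketch below could pull them out of the `href` values. It assumes that each link carries the holder address in an `a` query parameter (for example `.../token/<contract>?a=<holder>`), which you should verify against the printed output. `holder_address` and `top_holders` are made-up helper names.

```python
import re
from urllib.parse import urlparse, parse_qs

# An Ethereum address: "0x" followed by 40 hex digits.
ADDRESS_RE = re.compile(r"^0x[0-9a-fA-F]{40}$")

def holder_address(href):
    """Return the address found in the link's `a` query parameter, or None.

    Assumption: holder links look like ".../token/<contract>?a=<holder>".
    """
    for value in parse_qs(urlparse(href).query).get("a", []):
        if ADDRESS_RE.match(value):
            return value
    return None

def top_holders(hrefs, limit=20):
    """Collect up to `limit` distinct holder addresses, keeping page order."""
    addresses = []
    for href in hrefs:
        address = holder_address(href)
        if address is not None and address not in addresses:
            addresses.append(address)
        if len(addresses) == limit:
            break
    return addresses
```

With the `elements` list from above, this would be used as `top_holders(ele.get_attribute('href') for ele in elements)`. Links without a valid `a` parameter are skipped instead of raising an error.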