I try to scrape a website using Selenium. I got a problem when I try to get the coins name. because there're 2 elements inside 'td' How can I get rid of another element I don't want. or keep track to only its first element. (I found this post but I'm not sure if it answer my issue or not)
This is my whole code
#driver chrome def
website = 'https://www.bitkub.com/fee/cryptocurrency'
path = r"C:\\Users\\USER\\Downloads\\chromedriver.exe"
options = Options()
options.add_argument("start-maximized")
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
driver.get(website)
#giving variable
coin_name = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//tbody//tr//td[2]//span")))]
chain_name = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//tbody//tr//td[3]//div")))]
withdrawal_fees = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//tbody//tr//td[4]//div")))]
#print(coin_name)
#print(chain_name)
#print(withdrawal_fees)
#for loop make list
for coin, chains, wdf in zip(coin_name, chain_name, withdrawal_fees):
print("Coin name: {} Chain: {} Fee: {}".format(coin, chains, wdf))
The input of coin_name (which I mentioned that it got 2 elements)
['Civic(CVC)', '(CVC)', 'Bitcoin SV(BSV)', '(BSV)', 'Ethereum(ETH)', '(ETH)', 'Bitkub Coin(KUB)', '(KUB)', 'Compound(COMP)', '(COMP)', 'Curve DAO Token(CRV)', '(CRV)', .... ]
This is how element on the website look like
I wanted input to look like this so I can make dataframe out of it
['Civic(CVC)', 'Bitcoin SV(BSV)', 'Ethereum(ETH)', 'Bitkub Coin(KUB)', 'Compound(COMP)', 'Curve DAO Token(CRV)', .... ]
As per your current output:
['Civic(CVC)', '(CVC)', 'Bitcoin SV(BSV)', '(BSV)', 'Ethereum(ETH)', '(ETH)', 'Bitkub Coin(KUB)', '(KUB)', 'Compound(COMP)', '(COMP)', 'Curve DAO Token(CRV)', '(CRV)', .... ]
You can skip every alternative element and create a new list using list comprehension as follows:
# coin_name = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//tbody//tr//td[2]//span")))]
coin_name = ['Civic(CVC)', '(CVC)', 'Bitcoin SV(BSV)', '(BSV)', 'Ethereum(ETH)', '(ETH)', 'Bitkub Coin(KUB)', '(KUB)', 'Compound(COMP)', '(COMP)', 'Curve DAO Token(CRV)', '(CRV)']
res = [coin_name[i] for i in range(len(coin_name)) if i % 2 == 0]
print (res)
Console Output:
['Civic(CVC)', 'Bitcoin SV(BSV)', 'Ethereum(ETH)', 'Bitkub Coin(KUB)', 'Compound(COMP)', 'Curve DAO Token(CRV)']