I'm trying to capture the links of a webpage using Selenium in Python. My initial code is:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import pandas as pd
import time
from tqdm import tqdm
from selenium.common.exceptions import NoSuchElementException
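driver = webdriver.Chrome()  # assumption: the original snippet does not show how the driver was created
driver.get('https://www.lovecrave.com/shop/')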
Then I located all 12 products on the page with:
perso_flist = driver.find_elements_by_xpath("//p[@class='excerpt']")
Next, I want to capture the link for each product with:
listOflinks = []
for i in perso_flist:
    link_1 = i.find_elements_by_xpath(".//a[@href[1]]")
    listOflinks.append(link_1)
print(listOflinks)
And my output looks like:
print(listOflinks) # 12 EMPTY VALUES
[[], [], [], [], [], [], [], [], [], [], [], []]
What is wrong with my code? I'd appreciate your help.
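Your loop returns empty lists because ".//a" only searches inside each p element, and the product links are not inside those p elements: each link is an a tag that follows the p as a sibling. Select those a tags directly and read their href attribute: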
hrefs=[x.get_attribute("href") for x in driver.find_elements_by_xpath("//p[@class='excerpt']/following-sibling::a[1]")]
print(hrefs)
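Alternatively, the XPath //li/a[@class='full-link'] selects the same links. Note that find_elements_by_xpath is deprecated in Selenium 4 and removed in later releases; the equivalent call there is driver.find_elements(By.XPATH, ...).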
Output:
['https://www.lovecrave.com/products/duet-pro/',
'https://www.lovecrave.com/products/vesper/',
'https://www.lovecrave.com/products/wink/',
'https://www.lovecrave.com/products/duet/',
'https://www.lovecrave.com/products/duet-flex/',
'https://www.lovecrave.com/products/flex/',
'https://www.lovecrave.com/products/pocket-vibe/',
'https://www.lovecrave.com/products/bullet/',
'https://www.lovecrave.com/products/cuffs/',
'https://www.lovecrave.com/shop/gift-card/',
'https://www.lovecrave.com/shop/leather-case/',
'https://www.lovecrave.com/shop/vesper-replacement-charger/']
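To see why the original loop came back empty, the difference between searching inside the p and selecting the neighbouring a can be reproduced without a browser. The sketch below uses Python's built-in xml.etree.ElementTree on a small hand-written snippet. The markup is an assumption inferred from the XPaths above (the product names and exact nesting are hypothetical), so the real page may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for two product tiles, inferred from the XPaths in the
# answer. The real page may differ.
SNIPPET = """
<ul>
  <li>
    <p class="excerpt">Duet Pro</p>
    <a class="full-link" href="https://www.lovecrave.com/products/duet-pro/"></a>
  </li>
  <li>
    <p class="excerpt">Vesper</p>
    <a class="full-link" href="https://www.lovecrave.com/products/vesper/"></a>
  </li>
</ul>
"""

root = ET.fromstring(SNIPPET.strip())

# The question's approach: look for <a> elements *inside* each <p class="excerpt">.
inside_p = [p.findall(".//a") for p in root.findall(".//p[@class='excerpt']")]

# The answer's second XPath: take the <a class="full-link"> directly under each <li>.
hrefs = [a.get("href") for a in root.findall(".//li/a[@class='full-link']")]

print(inside_p)  # [[], []]
print(hrefs)
```

The first list holds one empty list per product, the same shape as the output in the question, because a relative search starting at the p never reaches its siblings. ElementTree supports only a subset of XPath (it has no following-sibling axis), which is why the sketch uses the //li/a form rather than the first expression.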