
Using Selenium in Python to capture the links on a web page


I'm trying to capture the links of a webpage using Selenium in Python. My initial code is:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import pandas as pd
import time
from tqdm import tqdm
from selenium.common.exceptions import NoSuchElementException
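
# The driver must exist before get() is called; Chrome is an assumption, any WebDriver works
driver = webdriver.Chrome()
driver.get('https://www.lovecrave.com/shop/')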

Then I located all 12 products on the page with:

perso_flist = driver.find_elements_by_xpath("//p[@class='excerpt']")

Next, I want to capture the link for each product with:

listOflinks = []
for i in perso_flist:
    link_1=i.find_elements_by_xpath(".//a[@href[1]]")
    listOflinks.append(link_1)
print(listOflinks)

But the output is 12 empty lists:

[[], [], [], [], [], [], [], [], [], [], [], []]

What is wrong with my code? I'd appreciate any help.


Solution

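  • The links are not inside the p elements, so searching within each p finds nothing. Instead, select the a tag that follows each p as a sibling, loop through those tags, and read the href attribute of each one.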

    hrefs=[x.get_attribute("href") for x in driver.find_elements_by_xpath("//p[@class='excerpt']/following-sibling::a[1]")]
    print(hrefs)
    

    Alternatively, use the XPath //li/a[@class='full-link'].

    Output:

    ['https://www.lovecrave.com/products/duet-pro/',
     'https://www.lovecrave.com/products/vesper/',
     'https://www.lovecrave.com/products/wink/',
     'https://www.lovecrave.com/products/duet/',
     'https://www.lovecrave.com/products/duet-flex/',
     'https://www.lovecrave.com/products/flex/',
     'https://www.lovecrave.com/products/pocket-vibe/',
     'https://www.lovecrave.com/products/bullet/',
     'https://www.lovecrave.com/products/cuffs/',
     'https://www.lovecrave.com/shop/gift-card/',
     'https://www.lovecrave.com/shop/leather-case/',
     'https://www.lovecrave.com/shop/vesper-replacement-charger/']
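
To see why the original loop returned empty lists, here is a minimal sketch that needs no browser. It uses Python's standard xml.etree.ElementTree on a small hypothetical fragment. The markup is an assumption inferred from the two XPaths in the answer (an excerpt p and a full-link a sitting side by side inside each li), not a copy of the real page, and the link text is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, inferred from the XPaths in the answer: each product
# <li> holds an excerpt <p> and, next to it, an <a class="full-link">.
FRAGMENT = """
<ul>
  <li>
    <p class="excerpt">Duet Pro</p>
    <a class="full-link" href="https://www.lovecrave.com/products/duet-pro/">View</a>
  </li>
  <li>
    <p class="excerpt">Vesper</p>
    <a class="full-link" href="https://www.lovecrave.com/products/vesper/">View</a>
  </li>
</ul>
""".strip()

root = ET.fromstring(FRAGMENT)

# Searching inside each <p> finds nothing: the <a> is a sibling, not a descendant.
inside_p = [p.findall(".//a") for p in root.findall(".//p[@class='excerpt']")]

# Searching from the parent <li> reaches the link directly.
hrefs = [a.get("href") for a in root.findall(".//li/a[@class='full-link']")]

print(inside_p)
print(hrefs)
```

The first print shows one empty list per product, which matches the symptom in the question. The second shows the hrefs. In Selenium the same idea applies: the search has to start from an element that actually contains the link, or step sideways with following-sibling.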