Tags: python, selenium, xpath, css-selectors, webdriverwait

How to extract the title text of every graphics card listing on Newegg using Selenium and Python


I am working on a Python script that pulls information on RTX 3080 graphics cards, and I am having trouble getting it to grab the title of each graphics card listing on the page. My script is as follows:

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
import time


options = Options()

PATH = 'C://Program Files (x86)//chromedriver.exe'

driver = webdriver.Chrome(PATH)

url = 'https://www.newegg.com/p/pl?d=RTX+3080'
actions = ActionChains(driver)

driver.maximize_window()
driver.get(url)

card_path = '/html/body/div[8]/div[3]/section/div/div/div[1]/div/dl[1]/dd/ul[2]/li/a'
desktop_graphics_cards = driver.find_element(By.XPATH, card_path)
desktop_graphics_cards.click()
time.sleep(5)

graphics_card = []
shipping_cost = []
price = []
total_cost = []


card_data = driver.find_elements(By.XPATH, '//div[@class = "item-cell"]')
print(card_data)
for card in card_data:
    graphics_card0 = card.find_element(By.XPATH, '//a[@title = "View Details"]').get_attribute('innerText')
    print(graphics_card0)

Whenever I run my script, it prints the title of the first graphics card and then repeats that same title for every remaining listing. For example, print(len(card_data)) returns 46 (the number of "item-cell" listings on the page), but print(graphics_card0) prints something like "MSI Gaming GeForce RTX 3080 10GB GDDR6X PCI Express 4.0 ATX Video Card RTX 3080 GAMING Z TRIO 10G LHR" 46 times. Any suggestions on how to proceed? I have also attached screenshots of the HTML below:

[Screenshots 1 and 2 of the page HTML are not reproduced here.]


Solution

  • To extract all the titles (for example, MSI Gaming GeForce RTX 3080 10GB GDDR6X PCI Express 4.0 ATX Video Card RTX 3080 GAMING Z TRIO 10G LHR) using Selenium, you can use a list comprehension with either of the following locator strategies:

    • Using css_selector and get_attribute("innerHTML"):

      driver.get("https://www.newegg.com/p/pl?d=RTX+3080")
      print([my_elem.get_attribute("innerHTML") for my_elem in driver.find_elements(By.CSS_SELECTOR, "a[title = 'View Details']")])
      
    • Using xpath and text attribute:

      driver.get("https://www.newegg.com/p/pl?d=RTX+3080")
      print([my_elem.text for my_elem in driver.find_elements(By.XPATH, "//a[@title = 'View Details']")])
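
The loop in the question repeats the first title because an XPath that starts with // is evaluated from the document root, even when find_element is called on a single card element. Prefixing the expression with a dot makes it relative to that element. A minimal sketch of that per-card variant follows, assuming each "item-cell" holds at most one "View Details" link; the helper name titles_per_card is hypothetical and not part of Selenium:

```python
# String value of selenium.webdriver.common.by.By.XPATH, spelled out so this
# helper has no import-time dependency on Selenium.
XPATH = "xpath"


def titles_per_card(cards):
    """Return the title text found inside each card element, in page order."""
    titles = []
    for card in cards:
        # The leading "." anchors the search at `card`. Without it, "//a[...]"
        # is evaluated from the document root, so every card yields the first
        # matching link on the page.
        links = card.find_elements(XPATH, './/a[@title = "View Details"]')
        if links:  # skip cells that have no title link
            titles.append(links[0].get_attribute("innerText").strip())
    return titles
```

With the driver from the question, this could be called as titles_per_card(driver.find_elements(By.XPATH, '//div[@class = "item-cell"]')). Keeping the per-card loop is mainly useful when other fields, such as price or shipping cost, need to be read from the same card.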
      

    Ideally, wait for the elements to become visible by using WebDriverWait with visibility_of_all_elements_located(); either of the following locator strategies works:

    • Using CSS_SELECTOR and text attribute:

      driver.get("https://www.newegg.com/p/pl?d=RTX+3080")
      print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "a[title = 'View Details']")))])
      
    • Using XPATH and get_attribute("innerText"):

      driver.get("https://www.newegg.com/p/pl?d=RTX+3080")
      print([my_elem.get_attribute("innerText") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//a[@title = 'View Details']")))])
      
    • Console Output:

      ['GIGABYTE A520M S2H mATX AM4 4+3 Phases Digital PWM, GIGABYTE Gaming GbE LAN, NVMe PCIe 3.0 x4 M.2, 3 Display Interfaces, Q-Flash Plus, RGB Fusion 2.0, Motherboard', 'MSI Gaming GeForce RTX 3080 10GB GDDR6X PCI Express 4.0 ATX Video Card RTX 3080 GAMING Z TRIO 10G LHR', 'Goldshell LT5 PRO 2455MH/S ASIC LTC Scrypt algorithm mining Dogecoin + Litecoin miner power consumption of 3100W', 'Goldshell Mini-DOGE 185MH/S Ordinary version (without psu)DOGE& LTC Mining Machine Low noise Small&simple Home Mining Home Riching', 'Goldshell LT5 PRO 2455MH/S ASIC LTC Scrypt algorithm mining Dogecoin + Litecoin miner power consumption of 3100W', 'Goldshell Mini-DOGE 185MH/S Ordinary version (without psu)DOGE& LTC Mining Machine Low noise Small&simple Home Mining Home Riching', 'ASUS TUF Gaming NVIDIA GeForce RTX 3080 V2 OC Edition Graphics Card (PCIe 4.0, 10GB GDDR6X, LHR, HDMI 2.1, DisplayPort 1.4a, Dual Ball Fan Bearings, Military-grade Certification, GPU Tweak II)', 'ABS Gladiator Gaming PC - Windows 10 Home - Intel i7 11700KF - GeForce RTX 3080 - G.Skill TridentZ RGB 16GB DDR4 3200MHz - 1TB Intel M.2 NVMe SSD', 'ABS ROG Gundam Limited Edition Gaming PC - Windows 10 Home - Intel i7 11700K - STRIX Gundam GeForce RTX 3080 - G.Skill 32GB 3200MHz - 1TB Intel M.2 - STRIX 360mm AIO', 'ABS Gladiator Gaming PC - Windows 10 Home - Intel i9 11900KF - GeForce RTX 3080 - G.Skill TridentZ RGB 16GB DDR4 3200 MHz - 1TB Intel M.2 NVMe SSD', 'Goldshell HS-BOX 235GH/S(without PSU)BOX & HNSB Mining Machine Low noise Small&simple Home Mining Home Riching', 'Goldshell CK-BOX 1050GH/S(without psu) CKB Mining Machine Low noise Small&simple Home Mining Home Riching', 'Adamant Custom 32-Core Video Editing Rendering Modelling Workstation Computer AMD Threadripper 3970X 3.7Ghz TRX40 2 ZENITH 256Gb DDR4 2x2TB NVMe 1800MBs SSD 10TB HDD 1000W Geforce RTX 3080', 'Adamant Custom 14-Core Rendering Modelling Workstation Computer Intel i9 10940X 3.3Ghz X299 AORUS 128Gb DDR4 2x2TB NVMe SSD 10TB HDD 850W 
Toughpower PSU Geforce RTX 3080 10Gb', 'GIGABYTE AORUS GeForce RTX 3080 MASTER 10GB GDDR6X PCI Express 4.0 ATX Video Card GV-N3080AORUS M-10GD (rev. 3.0) (LHR)', 'ABS Legend Gaming PC - Windows 10 Home - Intel i9 11900KF - GeForce RTX 3080 Ti - G.Skill TridentZ RGB 16GB DDR4 3200 MHz - 1TB Intel M.2 NVMe SSD', 'GIGABYTE Gaming OC GeForce RTX 3080 10GB GDDR6X PCI Express 4.0 ATX Video Card GV-N3080GAMING OC-10GD (rev. 2.0) (LHR)', 'MSI Suprim GeForce RTX 3080 10GB GDDR6X PCI Express 4.0 Video Card RTX 3080 SUPRIM X 10G LHR', 'MSI Ventus GeForce RTX 3080 PCI Express 4.0 Video Card RTX 3080 VENTUS 3X PLUS 10G OC LHR', 'ASUS ROG Strix NVIDIA GeForce RTX 3080 Ti OC Edition Gaming Graphics Card (PCIe 4.0, 12GB GDDR6X, HDMI 2.1, Axial-tech Fan Design, 2.9-Slot, Super Alloy Power II, ASUS Auto-Extreme Technology)', 'ASUS TUF Gaming GeForce RTX 3080 Ti 12GB GDDR6X PCI Express 4.0 Video Card TUF-RTX3080TI-O12G-GAMING', 'MSI Gaming GeForce RTX 3080 Ti 12GB GDDR6X PCI Express 4.0 Video Card RTX 3080 Ti Gaming X Trio 12G', 'NVIDIA GeForce RTX 3080 Ti Founders Edition 12GB GDDR6X', 'ZOTAC AMP Holo GeForce RTX 3080 Ti 12GB GDDR6X PCI Express 4.0 Video Card ZT-A30810F-10P', 'GIGABYTE AORUS GeForce RTX 3080 Ti 12GB GDDR6X PCI Express 4.0 ATX Video Card GV-N308TAORUS M-12GD', 'ASUS ROG STRIX GeForce RTX 3080 10GB GDDR6X PCI Express 4.0 x16 ATX Video Card ROG-STRIX-RTX3080-O10G-WHITE-V2 (LHR)', 'MSI Gaming Desktop Aegis RS 11TF-223US Intel Core i7 11th Gen 11700K (3.60 GHz) 32 GB DDR4 2 TB HDD 2 TB PCIe Gen3 SSD NVIDIA GeForce RTX 3080 Ti Windows 10 Home 64-bit', 'MSI Gaming Desktop Aegis R 11TE-097US Intel Core i7 11th Gen 11700 (2.50GHz) 16GB DDR4 3TB HDD 1 TB PCIe SSD NVIDIA GeForce RTX 3080 Windows 10 Home 64-bit', 'Skytech PRISM II Gaming PC Desktop - AMD Ryzen 7 5800X, RTX 3080 10GB GDDR6X, 16GB DDR4 3200, 1TB Gen4 SSD, RGB Fans, AC Wi-Fi, Windows 10 Home 64-bit, Black', 'MSI Ventus GeForce RTX 3080 10GB GDDR6X PCI Express 4.0 Video Card RTX 3080 VENTUS 3X 10G OC LHR', 
'GIGABYTE AORUS GeForce RTX 3080 XTREME WATERFORCE WB 10G (rev. 2.0) Graphics Card, WATERFORCE Water Block Cooling System, 10GB 320-bit GDDR6X, GV-N3080AORUSX WB-10GD Rev2.0 Video Card (LHR)', 'Skytech Shiva Gaming PC Desktop - AMD Ryzen 5 5600X 3.7 GHz, RTX 3080 10 GB GDDR6X, 16 GB DDR4 3200, 1 TB NVMe SSD, B550 Motherboard, 750W Gold PSU, AC WiFi, Windows 10 Home 64-bit', 'CLX SET VR-Ready Gaming Desktop - Liquid Cooled Intel Core i9 10900KF 3.7Ghz 10-Core Processor, 32GB DDR4 Memory, GeForce RTX 3080 10GB GDDR6X Graphics, 960GB SSD, 4TB HDD, WiFi, Win 10 Home 64-bit', 'AORUS MODEL X Gaming PC Computer Desktop (Intel i9-11900K, NVIDIA GeForce RTX 3080 10GB GDDR6X, 16GB DDR4 RAM, 3TB M.2 SSD) - GB-AMXI9N8A-2051', 'GIGABYTE AORUS GeForce RTX 3080 MASTER 10G (rev. 2.0) Graphics Card, Max Covered Cooling, 10GB 320-bit GDDR6X, GV-N3080AORUS M-10GD Rev2.0 Video Card', 'iBUYPOWER Gaming Desktop Revolt 3 i7BG Intel Core i7 11th Gen 11700KF (3.60GHz) 16GB DDR4 1 TB NVMe SSD NVIDIA GeForce RTX 3080 Windows 10 Home 64-bit', 'ASUS ROG Strix G15CE Gaming & Entertainment Desktop PC (Intel i7-11700KF 8-Core, 32GB RAM, 1TB m.2 SATA SSD + 3TB HDD (3.5), NVIDIA GeForce RTX 3080 Ti, Wifi, Bluetooth, 6xUSB 3.1, Win 10 Pro)', 'iBUYPOWER Gaming Desktop Trace5MR1003Ti Intel Core i7 11th Gen 11700KF (3.60GHz) 16GB DDR4 1 TB PCIe SSD NVIDIA GeForce RTX 3080 Ti Windows 10 Home 64-bit', 'GIGABYTE Vision GeForce RTX 3080 Ti 12GB GDDR6X PCI Express 4.0 ATX Video Card GV-N308TVISION OC-12GD', 'GIGABYTE AORUS GeForce RTX 3080 XTREME 10GB GDDR6X PCI Express 4.0 ATX Video Card GV-N3080AORUS X-10GD (rev. 
2.0) (LHR)', 'ZOTAC Trinity OC GeForce RTX 3080 Ti 12GB GDDR6X PCI Express 4.0 ATX Video Card ZT-A30810J-10P', 'CLX SET VR-Ready Gaming Desktop - Liquid Cooled AMD Ryzen 9 5900X 3.7Ghz 12-Core Processor, 64GB DDR4 Memory, GeForce RTX 3080 10GB GDDR6X Graphics, 1TB SSD, 6TB HDD, WiFi, Windows 10 Home 64-bit', 'ASUS ROG STRIX GeForce RTX 3080 Ti 12GB GDDR6X PCI Express 4.0 x16 Video Card ROG-STRIX-LC-RTX3080TI-O12G-GAMING', 'GIGABYTE Gaming GeForce RTX 3080 Ti 12GB GDDR6X PCI Express 4.0 ATX Video Card GV-N308TGAMING OC-12GD']
      
    • Note: You have to add the following imports:

      from selenium.webdriver.support.ui import WebDriverWait
      from selenium.webdriver.common.by import By
      from selenium.webdriver.support import expected_conditions as EC
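
Putting the imports and the explicit wait together, a complete script might look like the sketch below. It assumes chromedriver can be found automatically (on PATH, or through the driver management built into recent Selenium 4 releases); the names collect_titles and main are hypothetical, chosen only for this example:

```python
def collect_titles(elements):
    """Return the non-empty, whitespace-trimmed text of each element."""
    titles = (element.text.strip() for element in elements)
    return [title for title in titles if title]


def main():
    # Selenium is imported here so collect_titles() can be reused and tested
    # without a browser or driver installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumption: chromedriver is discoverable without an explicit path.
    driver = webdriver.Chrome()
    try:
        driver.get("https://www.newegg.com/p/pl?d=RTX+3080")
        # Block for up to 20 seconds until the title links are visible.
        links = WebDriverWait(driver, 20).until(
            EC.visibility_of_all_elements_located(
                (By.CSS_SELECTOR, "a[title = 'View Details']")
            )
        )
        for title in collect_titles(links):
            print(title)
    finally:
        driver.quit()  # close the browser even if the wait times out
```

Calling main() opens a browser, prints one title per line, and then closes the browser.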
      
