Selenium web scraping: how to prioritize one tab over another

Project: saving all the URLs/titles from https://theuselessweb.com/

Test code (only 3 pages, and it prints the results instead of saving them):

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from time import sleep

PATH = r"C:\Users\XXX\Documents\scraping\chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get("https://theuselessweb.com/")
driver.switch_to.window(driver.window_handles[-1])
button = driver.find_element_by_id("button")

for i in range(3):
    button.click()
    sleep(2)
    driver.switch_to.window(driver.window_handles[-1])
    print(driver.current_url)
    print(driver.title)
    driver.close()

Error(s):

DevTools listening on ws://127.0.0.1:60235/devtools/browser/a5ea4ab0-fba6-4a34-b0ee-8926876c554f
[11636:4168:0626/143411.535:ERROR:device_event_log_impl.cc(214)] [14:34:11.535] USB: usb_device_handle_win.cc:1058 Failed to read descriptor from node connection: Ein an das System angeschlossenes Gerät funktioniert nicht. (0x1F)
[11636:4168:0626/143411.552:ERROR:device_event_log_impl.cc(214)] [14:34:11.552] USB: usb_device_handle_win.cc:1058 Failed to read descriptor from node connection: Ein an das System angeschlossenes Gerät funktioniert nicht. (0x1F)
[11636:4168:0626/143411.555:ERROR:device_event_log_impl.cc(214)] [14:34:11.555] USB: usb_device_handle_win.cc:1058 Failed to read descriptor from node connection: Ein an das System angeschlossenes Gerät funktioniert nicht. (0x1F)
https://thatsthefinger.com/           #this is what I want
The finger, deal with it.             #this is what I want
Traceback (most recent call last):
  File "C:\Users\XXX\Documents\scraping\programs\linkscraping.py", line 16, in <module>
    button.click()
  File "C:\Users\XXX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\selenium\webdriver\remote\webelement.py", line 80, in click
    self._execute(Command.CLICK_ELEMENT)
  File "C:\Users\XXX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\selenium\webdriver\remote\webelement.py", line 633, in _execute
    return self._parent.execute(command, params)
  File "C:\Users\XXX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\XXX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchWindowException: Message: no such window: target window already closed
from unknown error: web view not found
  (Session info: chrome=91.0.4472.124)

It prints the URL and title of the first website and then crashes. Also, every time I run the driver.get(ANYURL) command, it opens the link AND the Chrome settings page (chrome://settings/triggeredResetProfileSettings). Maybe this is what breaks it. Either way, it would be really helpful if I could get rid of this unwanted window too.

Solution

Here is a solution to the problem. It still opens every link, but because Chrome runs headless, the windows are not visible to the user.

Here, x is the number of random websites you want to extract.

The code opens the site, clicks the button x times, then switches to each newly opened tab in turn and prints its URL and title. At the end, it closes Chrome.

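from selenium.webdriver.chrome.options import Options
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager

# Run Chrome without a visible window
options = Options()
options.headless = True
driver = webdriver.Chrome(
    ChromeDriverManager().install(),
    options=options
)

# Number of random websites to extract
x = 10

driver.get('https://theuselessweb.com/')
button = driver.find_element_by_id("button")

# Each click opens a random site in a new tab; focus stays on the
# original tab, so the button remains clickable
for i in range(x):
    button.click()

# Index 0 is assumed to be the original tab, so the opened tabs start at 1
for i in range(x):
    driver.switch_to.window(driver.window_handles[i+1])
    print(driver.current_url)
    print(driver.title)

# Close every tab and end the session
driver.quit()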
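
For reference, the traceback in the question comes from the second button.click(): after driver.close(), the driver is still pointed at the tab that was just closed, so the button on the original page is no longer reachable. Below is a minimal sketch of a close-and-switch-back variant, assuming the driver is focused on the page that contains the button when the helper is called. The names wait_for_new_handle, collect_pages and run are hypothetical helpers introduced here, not part of Selenium, and run() assumes the same Selenium 3-era calls as the code above (newer Selenium releases use driver.find_element(By.ID, "button") instead of find_element_by_id). The polling loop is a hand-rolled stand-in for Selenium's WebDriverWait combined with expected_conditions.new_window_is_opened.

```python
import time


def wait_for_new_handle(driver, known_handles, timeout=10.0, poll=0.2):
    """Return a window handle that is not in known_handles.

    Polls driver.window_handles until a new handle appears or the
    timeout (in seconds) expires, in which case TimeoutError is raised.
    """
    deadline = time.monotonic() + timeout
    while True:
        new_handles = [h for h in driver.window_handles if h not in known_handles]
        if new_handles:
            return new_handles[0]
        if time.monotonic() >= deadline:
            raise TimeoutError("no new window appeared")
        time.sleep(poll)


def collect_pages(driver, button, count):
    """Click button count times and return a (url, title) pair per opened tab."""
    main_handle = driver.current_window_handle  # tab that owns the button
    results = []
    for _ in range(count):
        known = set(driver.window_handles)
        button.click()
        new_handle = wait_for_new_handle(driver, known)
        driver.switch_to.window(new_handle)
        results.append((driver.current_url, driver.title))
        driver.close()                        # closes only the new tab
        driver.switch_to.window(main_handle)  # refocus before the next click
    return results


def run(count=3):
    """Hypothetical entry point; same driver setup as the answer above."""
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from webdriver_manager.chrome import ChromeDriverManager

    options = Options()
    options.headless = True
    driver = webdriver.Chrome(ChromeDriverManager().install(), options=options)
    try:
        driver.get("https://theuselessweb.com/")
        button = driver.find_element_by_id("button")
        for url, title in collect_pages(driver, button, count):
            print(url)
            print(title)
    finally:
        driver.quit()
```

Calling run(3) should print three URL/title pairs while only ever keeping one extra tab open. The title may come back empty if it is read before the new page has finished loading, so a short wait before reading it (the question used sleep(2)) may still be needed.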