How do I set up a proxy and capture headers with Selenium Wire and Tor Browser in Python when I get the message "... [WinError 10061]"?


To capture request headers (the Selenium library does not support this), I decided to use the Selenium Wire library. I found the following page, which explains how to use Selenium Wire with Tor Browser: https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/snippets/60 . However, when I run the code from that page I get a connection error: "Error connecting to SOCKS5 proxy 127.0.0.1:9150: [WinError 10061]". I also cannot get header capture working from the Selenium Wire documentation: https://github.com/wkeeling/selenium-wire . The documentation gives the following pattern:

def interceptor(request):
    del request.headers['Referer']  # Remember to delete the header first
    request.headers['Referer'] = 'some_referer'  # Spoof the referer

driver.request_interceptor = interceptor
driver.get(...)

# All requests will now use 'some_referer' for the referer

However, it does not explain what request is, or why the function is assigned by name (interceptor) rather than called (interceptor()).


Solution
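
The two open points about the interceptor pattern quoted in the question can be shown without a browser. Below is a minimal sketch, assuming hypothetical stand-in classes (FakeRequest and FakeDriver are illustrations only, not Selenium Wire classes): the driver stores the function it is given and later calls it itself, passing in the request object it has built. That is why the function is assigned without parentheses; writing interceptor() would call it immediately, with no request to pass.

```python
class FakeRequest:
    """Hypothetical stand-in for the request object passed to an interceptor."""

    def __init__(self, url, headers):
        self.url = url
        self.headers = dict(headers)


class FakeDriver:
    """Hypothetical stand-in for a driver that stores a callback function."""

    def __init__(self):
        self.request_interceptor = None  # a function, set by the caller

    def get(self, url):
        # Build the outgoing request, then let the stored callback modify it.
        request = FakeRequest(url, {'Referer': 'original_referer'})
        if self.request_interceptor is not None:
            self.request_interceptor(request)  # the driver makes the call
        return request


def interceptor(request):
    # Same shape as the documented pattern: delete first, then set.
    del request.headers['Referer']
    request.headers['Referer'] = 'some_referer'


driver = FakeDriver()
driver.request_interceptor = interceptor  # no parentheses: pass the function
sent = driver.get('https://example.org/')
print(sent.headers['Referer'])  # prints "some_referer"
```

In this stand-in the headers are a plain dict, so the del is not strictly needed; it is kept only to mirror the pattern from the Selenium Wire documentation, which asks for the header to be deleted first.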

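  • The proxy settings from the example only work if Tor Browser is already running, because the SOCKS5 proxy on 127.0.0.1:9150 is provided by the running Tor Browser. The following code therefore starts Tor Browser from the script before creating the driver. To capture headers, follow the Selenium Wire documentation exactly: request is the request object that Selenium Wire passes to your function, and you assign the function itself (interceptor, without parentheses) so that Selenium Wire can call it later. Below is a working script that allows you to capture headers: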

    import os
    import time
    
    from seleniumwire import webdriver
    from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
    from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
    
    def firefoxdriver(my_url):
        """Prepare Tor Browser and return a seleniumwire driver for it."""
        # The location of the Tor Browser bundle
        #   for my laptop.
        # tbb_dir = r'C:\Users\Oliver\Desktop\Tor Browser'
        #   for my mainframe.
        tbb_dir = r'C:\Users\olive\OneDrive\Pulpit\Tor Browser'
    
        # Set the Tor Browser binary and profile.
        tb_binary = tbb_dir + r'\Browser\firefox.exe'
        tb_profile = tbb_dir + r'\Browser\TorBrowser\Data\Browser\profile.default'
        binary = FirefoxBinary(tb_binary)
        profile = FirefoxProfile(tb_profile)
    
        # Start Tor Browser first so that its Tor proxy can start
        #   listening on port 9150 before the driver connects.
        torexe = os.popen(tb_binary)
    
        # Disable Tor Launcher to prevent it connecting the Tor Browser to 
        #   Tor directly.
        os.environ['TOR_SKIP_LAUNCH'] = '1'
        os.environ['TOR_TRANSPROXY'] = '1'
    
        # Relax certificate pinning and the HTTP Strict Transport Security
        #   (HSTS) preload list in order to have seleniumwire between the
        #   browser and Tor.
        profile.set_preference("security.cert_pinning.enforcement_level", 0)
        profile.set_preference("network.stricttransportsecurity.preloadlist", False)
    
        # Tell Tor Button it is OK to use seleniumwire
        profile.set_preference("extensions.torbutton.local_tor_check", False)
        profile.set_preference("extensions.torbutton.use_nontor_proxy", True)
    
        # Needed for JavaScript to work at all; otherwise JS stays disabled
        #   regardless of the Tor Browser's security slider value.
        profile.set_preference("browser.startup.homepage_override.mstone", "68.8.0")
    
        # Configure seleniumwire to upstream traffic to Tor running on
        #   port 9150.
        # You can increase or decrease the timeout if you are trying to
        #   load a page that requires a lot of requests. It is in seconds.
        # Note: connection_timeout is a top-level seleniumwire option, so
        #   it sits next to the 'proxy' key rather than inside it.
        options = {
            'proxy': {
                'http': 'socks5h://127.0.0.1:9150',
                'https': 'socks5h://127.0.0.1:9150'
            },
            'connection_timeout': 20
        }
    
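        # Note: firefox_profile and firefox_binary are the older Selenium
        #   keyword arguments; recent Selenium 4 releases expect both to be
        #   set on a Firefox Options object instead.
        driver = webdriver.Firefox(firefox_profile=profile,
                                   firefox_binary=binary,
                                   seleniumwire_options=options)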
    
        return driver
    
    def interceptor(request):
        """
        Request interceptor: seleniumwire calls it with each outgoing
        request, and it replaces selected headers before the request is sent.
        """
        # Delete each header before setting it, as the documentation advises.
        del request.headers['User-Agent']
        request.headers['User-Agent'] = (
            'Mozilla/5.0 (Windows NT 10.0;rv:102.0)'
            ' Gecko/20100101 Firefox/102.0')
        del request.headers['Accept']
        request.headers['Accept'] = (
            'text/html,application/xhtml+xml,application'
            '/xml;q=0.9,image/avif,image/webp,*/*;q=0.8')
        del request.headers['Accept-Language']
        request.headers['Accept-Language'] = 'en-US,en;q=0.5'
    
    # URL of the page to load.
    my_url = 'https://httpbin.org/headers'
    
    # Prepare Tor Browser and create the driver.
    driver = firefoxdriver(my_url)
    
    # Register the interceptor on the driver. Pass the function itself
    #   (no parentheses) so that seleniumwire can call it for each request.
    driver.request_interceptor = interceptor
    
    # Load the page in the browser.
    driver.get(my_url)
    
    # Access requests via the `requests` attribute.
    for request in driver.requests:
        if request.response:
            print(
                request.url,
                request.response.status_code,
                request.response.headers['Content-Type'],
                request.headers
            )
    
    time.sleep(15)
    driver.quit()
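
One caveat about the script above: it starts Tor Browser and creates the driver immediately afterwards, so on a slow start the SOCKS port may not be open yet, and the same [WinError 10061] (the Windows "connection refused" error) can come back. Below is a minimal sketch of waiting for the port first, using only the standard library. wait_for_port is a hypothetical helper name, not part of Selenium Wire, and the 60-second default is an arbitrary assumption:

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=0.5):
    """Return True once host:port accepts TCP connections, else False.

    Hypothetical helper: it only checks that something is listening,
    not that the listener is really a Tor SOCKS proxy.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the port is open; close it at once.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: wait a little and try again.
            time.sleep(interval)
    return False
```

Inside firefoxdriver(), one could call wait_for_port('127.0.0.1', 9150) right after os.popen(tb_binary) and raise an error if it returns False, instead of letting the proxy connection fail later.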