python linux selenium tor-browser-bundle

Can't use TorBrowser with Selenium? (Python, Linux)


I'd like to run TorBrowser through selenium.

I've been able to use the Tor network through Selenium using the tor daemon and a Firefox instance.

I'd like to use TorBrowser so that I can run multiple instances, each using a different Tor exit relay. I know it's possible to run multiple TorBrowser instances (without Selenium) by specifying the ports to use in each TorBrowser bundle, adding these lines to Browser/TorBrowser/Data/Browser/profile.default/user.js:

user_pref("network.proxy.socks_port", ChangeToTheDesiredPort1);
user_pref("extensions.torlauncher.control_port", ChangeToTheDesiredPort2);
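To avoid editing each bundle by hand, the two preferences above could be appended with a small script. This is only a sketch: `write_port_prefs` is a hypothetical helper name, and it assumes the profile directory already exists and that appending to user.js is acceptable.

```python
import os


def write_port_prefs(profile_dir, socks_port, control_port):
    """Append the SOCKS and control port preferences to user.js.

    profile_dir is assumed to be the bundle's profile.default directory.
    Returns the path of the user.js file that was written.
    """
    lines = [
        'user_pref("network.proxy.socks_port", %d);' % socks_port,
        'user_pref("extensions.torlauncher.control_port", %d);' % control_port,
    ]
    user_js = os.path.join(profile_dir, "user.js")
    # Append so that any preferences already in user.js are kept.
    with open(user_js, "a") as handle:
        handle.write("\n".join(lines) + "\n")
    return user_js
```

Each bundle copy would get its own pair of ports, for example `write_port_prefs(profile_dir, 9250, 9251)` for one copy and a different pair for the next.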

Here is the code I used to try to launch TorBrowser through Selenium. I'm taking things step by step, so in this test I'm using a fresh TorBrowser archive without the customized profile:

#!/usr/bin/python
# -*- coding: UTF-8 -*-

from selenium import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from selenium.webdriver.firefox.options import Options as FirefoxOptions
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile

profile = FirefoxProfile("tor-browser_en-US/Browser/TorBrowser/Data/Browser/profile.default")
profile.set_preference('network.proxy.type', 1)
profile.set_preference('network.proxy.socks', '127.0.0.1')
profile.set_preference('network.proxy.socks_port', 9050)

options = FirefoxOptions()
options.profile = profile
binary = FirefoxBinary("tor-browser_en-US/Browser/start-tor-browser")

print "0"
driver = webdriver.Firefox(options=options, firefox_binary=binary)
print "1"
driver.get('https://check.torproject.org/')

My script hangs as soon as I try to instantiate the webdriver. It prints 0, never prints 1, and TorBrowser never tries to connect to https://check.torproject.org/

If I replace

binary = FirefoxBinary("tor-browser_en-US/Browser/start-tor-browser")

by

binary = FirefoxBinary("tor-browser_en-US/Browser/firefox")

the script no longer hangs and TorBrowser tries to load https://check.torproject.org/, but it never connects to the Tor network, which results in the following error:

selenium.common.exceptions.WebDriverException: Message: Reached error page: about:neterror?e=proxyConnectFailure&u=https://check.torproject.org/&c=UTF-8&d=Firefox is configured to use a proxy server that is refusing connections.
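A proxyConnectFailure of this kind is consistent with nothing listening on the configured SOCKS port. One way to check that is a plain TCP connection attempt, sketched below. `port_is_open` is a hypothetical helper, not part of Selenium or Tor, and the check only tells you that something accepts connections on the port, not that it is a working Tor SOCKS proxy.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except socket.error:
        # Connection refused, timed out, or host unreachable.
        return False
    conn.close()
    return True
```

For example, `port_is_open("127.0.0.1", 9050)` tests the port set in the profile above. A system tor daemon typically listens on 9050, while the tor process bundled with Tor Browser defaults to 9150, so it may be worth checking both.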

Some details about my setup (64-bit):

  • geckodriver 0.30.0
  • TorBrowser 11.0.4
  • Python 2.7.17
  • Ubuntu 18.04.1

I've kept my example script as simple as possible. I've tried many variations over the last two days and found nothing that works.

Thanks in advance for your answers.


Solution

  • I finally solved my problem (tested on a VM running Ubuntu 20.04 so that I could install Selenium 4). I was able to launch multiple TorBrowser instances with different exit nodes using tbselenium, which requires Selenium 4: https://github.com/webfp/tor-browser-selenium

    Here is some sample code:

    from stem.control import Controller
    from tbselenium.tbdriver import TorBrowserDriver
    import tbselenium.common as cm
    from tbselenium.utils import launch_tbb_tor_with_stem
    from selenium.webdriver.common.utils import free_port
    import tempfile
    from os.path import join
    import time
    
    
    tbb_dir = "PathToTorBrowserBundle"
    gecko = "PathToGeckodriver"
    
    # Pick unused ports and a private data directory so that several
    # instances can run side by side without sharing a tor process.
    socks_port = free_port()
    control_port = free_port()
    tor_data_dir = tempfile.mkdtemp()
    torrc = {'ControlPort': str(control_port),
             'SOCKSPort': str(socks_port),
             'DataDirectory': tor_data_dir}
    
    # Start the tor binary shipped with the bundle using the settings above.
    tor_binary = join(tbb_dir, cm.DEFAULT_TOR_BINARY_PATH)
    tor_process = launch_tbb_tor_with_stem(tbb_path=tbb_dir, torrc=torrc, tor_binary=tor_binary)
    
    Controller.from_port(port=control_port).authenticate()
    
    # Point the browser at the tor process started above.
    driver = TorBrowserDriver(tbb_dir, socks_port=socks_port, control_port=control_port, tor_cfg=cm.USE_STEM, executable_path=gecko)
    driver.load_url("https://check.torproject.org")
    
    # time.sleep() takes seconds, so this keeps the browser open for about 83 minutes.
    time.sleep(5000)
    
    # Close the browser before stopping tor.
    driver.quit()
    tor_process.kill()
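To run several instances at once, each one needs its own SOCKS port, control port, and data directory. The sketch below prepares such settings using only the standard library. `pick_free_ports` and `make_torrc_configs` are hypothetical helpers, not part of tbselenium or stem, and a port that is free when picked could in principle be taken by another process before tor binds it.

```python
import socket
import tempfile


def pick_free_ports(count):
    """Return `count` distinct TCP ports that were free when checked."""
    sockets = []
    try:
        for _ in range(count):
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            # Port 0 asks the OS for any unused port.
            sock.bind(("127.0.0.1", 0))
            sockets.append(sock)
        # All sockets are still bound here, so the ports cannot repeat.
        return [sock.getsockname()[1] for sock in sockets]
    finally:
        for sock in sockets:
            sock.close()


def make_torrc_configs(instances):
    """Build one torrc-style dict per instance with separate ports and data dir."""
    ports = pick_free_ports(2 * instances)
    configs = []
    for i in range(instances):
        configs.append({
            "SOCKSPort": str(ports[2 * i]),
            "ControlPort": str(ports[2 * i + 1]),
            "DataDirectory": tempfile.mkdtemp(),
        })
    return configs
```

Each dict has the same keys as the `torrc` dict in the sample code, so one could be passed to `launch_tbb_tor_with_stem` per instance, with the matching ports given to `TorBrowserDriver`.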