python-3.x selenium proxy selenium-chromedriver

How to set a proxy with authentication in Selenium chromedriver with Python?


I am writing a script that crawls a website to gather some data, but the site blocked me after too many requests. With a proxy I could send more requests than I currently do. I have set the proxy with the Chrome option --proxy-server:

options.add_argument('--proxy-server={}'.format('http://ip:port'))

However, I am using a paid proxy that requires authentication, so Chrome shows an alert box asking for a username and password.

Then I tried passing the username and password in the proxy URL:

options.add_argument('--proxy-server={}'.format('http://username:password@ip:port'))

But that does not seem to work either. While looking for a solution I found the code below, and I tried it both with and without the Chrome extension Proxy Auto Auth:

proxy = {'address': settings.PROXY,
         'username': settings.PROXY_USER,
         'password': settings.PROXY_PASSWORD}

capabilities = dict(DesiredCapabilities.CHROME)
capabilities['proxy'] = {'proxyType': 'MANUAL',
                         'httpProxy': proxy['address'],
                         'ftpProxy': proxy['address'],
                         'sslProxy': proxy['address'],
                         'noProxy': '',
                         'class': "org.openqa.selenium.Proxy",
                         'autodetect': False,
                         'socksUsername': proxy['username'],
                         'socksPassword': proxy['password']}
options.add_extension(os.path.join(settings.DIR, "extension_2_0.crx"))  # proxy auth extension

Neither of these worked properly. It seemed to work at first, because the proxy authentication alert disappeared, but when I checked my IP by googling "what is my IP", it confirmed that the proxy was not being used.

Can anyone help me authenticate against the proxy server with chromedriver?


Solution

  • Selenium Chrome Proxy Authentication

    Setting chromedriver proxy with Selenium using Python

    If you need to use a proxy with Python and the Selenium library with chromedriver, you usually use the following code (without any username and password):

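    from selenium import webdriver

    # hostname and port are placeholders for your proxy's address, e.g. '192.168.3.2' and '8080'
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument('--proxy-server=%s:%s' % (hostname, port))
    driver = webdriver.Chrome(options=chrome_options)  # chrome_options= is the deprecated spelling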
    

    This works fine unless the proxy requires authentication. If the proxy requires you to log in with a username and password, it will not work, and you have to use the trickier solution explained below. Alternatively, if you whitelist your server's IP address with the proxy provider, the proxy should not ask for credentials.
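    Since the question shows that credentials embedded in the --proxy-server value do not work, it can help to split a full proxy URL into its parts first, so the host and port go into the flag and the username and password are kept for the authentication step. The following is a minimal sketch using only the standard library; the helper names split_proxy_url and proxy_server_argument are hypothetical and not part of Selenium.

```python
from urllib.parse import urlsplit, unquote


def split_proxy_url(proxy_url):
    """Split 'http://user:pass@host:port' into its parts (hypothetical helper)."""
    parts = urlsplit(proxy_url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        # Credentials may be percent-encoded in the URL, so decode them
        "username": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
    }


def proxy_server_argument(proxy_url):
    """Build the --proxy-server flag without the credentials."""
    p = split_proxy_url(proxy_url)
    return "--proxy-server=%s://%s:%s" % (p["scheme"], p["host"], p["port"])


info = split_proxy_url("http://proxy-user:proxy-password@192.168.3.2:8080")
print(info["host"], info["port"], info["username"])
print(proxy_server_argument("http://proxy-user:proxy-password@192.168.3.2:8080"))
```

    The flag string can be passed to chrome_options.add_argument, while the username and password still have to be supplied separately, for example through an extension as described in the next section.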

    HTTP Proxy Authentication with Chromedriver in Selenium

    To set up proxy authentication, we generate a small Chrome extension and load it into Chrome dynamically using the code below. The code configures Selenium with chromedriver to use an HTTP proxy that requires authentication with a username/password pair.

    import os
    import zipfile
    
    from selenium import webdriver
    
    PROXY_HOST = '192.168.3.2'  # rotating proxy or host
    PROXY_PORT = 8080 # port
    PROXY_USER = 'proxy-user' # username
    PROXY_PASS = 'proxy-password' # password
    
    
    manifest_json = """
    {
        "version": "1.0.0",
        "manifest_version": 2,
        "name": "Chrome Proxy",
        "permissions": [
            "proxy",
            "tabs",
            "unlimitedStorage",
            "storage",
            "<all_urls>",
            "webRequest",
            "webRequestBlocking"
        ],
        "background": {
            "scripts": ["background.js"]
        },
        "minimum_chrome_version":"22.0.0"
    }
    """
    
    background_js = """
    var config = {
            mode: "fixed_servers",
            rules: {
            singleProxy: {
                scheme: "http",
                host: "%s",
                port: parseInt(%s)
            },
            bypassList: ["localhost"]
            }
        };
    
    chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
    
    function callbackFn(details) {
        return {
            authCredentials: {
                username: "%s",
                password: "%s"
            }
        };
    }
    
    chrome.webRequest.onAuthRequired.addListener(
                callbackFn,
                {urls: ["<all_urls>"]},
                ['blocking']
    );
    """ % (PROXY_HOST, PROXY_PORT, PROXY_USER, PROXY_PASS)
    
    
    def get_chromedriver(use_proxy=False, user_agent=None):
        path = os.path.dirname(os.path.abspath(__file__))
        chrome_options = webdriver.ChromeOptions()
        if use_proxy:
            pluginfile = 'proxy_auth_plugin.zip'
    
            with zipfile.ZipFile(pluginfile, 'w') as zp:
                zp.writestr("manifest.json", manifest_json)
                zp.writestr("background.js", background_js)
            chrome_options.add_extension(pluginfile)
        if user_agent:
            chrome_options.add_argument('--user-agent=%s' % user_agent)
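        # Recent Selenium 4 releases no longer accept the chrome_options keyword
        # or a positional driver path; pass options= and make sure chromedriver
        # can be found (for example on PATH).
        driver = webdriver.Chrome(options=chrome_options)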
        return driver
    
    def main():
        driver = get_chromedriver(use_proxy=True)
        #driver.get('https://www.google.com/search?q=my+ip+address')
        driver.get('https://httpbin.org/ip')
    
    if __name__ == '__main__':
        main()
    

    The function get_chromedriver returns a configured Selenium webdriver that you can use in your application. This code is tested and works just fine.
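    The script above loads https://httpbin.org/ip but does not read the result. To confirm that traffic really goes through the proxy, the reported address can be compared with the proxy's exit IP. This is a rough sketch: extract_origin and proxy_is_in_use are hypothetical helpers, and expected_ip is an assumption (with a rotating proxy the exit IP may differ from the proxy host).

```python
import json


def extract_origin(body_text):
    """Return the IP string(s) reported by httpbin.org/ip as a list."""
    origin = json.loads(body_text)["origin"]
    # httpbin may report several comma-separated addresses
    return [ip.strip() for ip in origin.split(",")]


def proxy_is_in_use(driver, expected_ip):
    """Load httpbin.org/ip in the given driver and compare the reported IP."""
    driver.get("https://httpbin.org/ip")
    # "tag name" is the locator string behind By.TAG_NAME
    body_text = driver.find_element("tag name", "body").text
    return expected_ip in extract_origin(body_text)
```

    For example, proxy_is_in_use(get_chromedriver(use_proxy=True), '192.168.3.2') should return True only if httpbin sees the request coming from that address.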

    Read more about the onAuthRequired event in the Chrome extension documentation.
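    One fragile spot in the script above is that the host, port, username and password are pasted into the JavaScript source with % formatting, so a password containing a double quote or backslash would produce a broken background.js. A possible variation, sketched here under the assumption that the same manifest and listener are wanted, serializes the values with json.dumps instead. The name build_proxy_extension and the out_dir parameter are hypothetical.

```python
import json
import os
import tempfile
import zipfile

MANIFEST = {
    "version": "1.0.0",
    "manifest_version": 2,
    "name": "Chrome Proxy",
    "permissions": ["proxy", "tabs", "unlimitedStorage", "storage",
                    "<all_urls>", "webRequest", "webRequestBlocking"],
    "background": {"scripts": ["background.js"]},
    "minimum_chrome_version": "22.0.0",
}

BACKGROUND_TEMPLATE = """
var config = %(config)s;
chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
chrome.webRequest.onAuthRequired.addListener(
    function(details) { return {authCredentials: %(credentials)s}; },
    {urls: ["<all_urls>"]},
    ["blocking"]
);
"""


def build_proxy_extension(host, port, user, password, out_dir=None):
    """Write the proxy-auth extension zip and return its path (hypothetical helper)."""
    config = {
        "mode": "fixed_servers",
        "rules": {
            "singleProxy": {"scheme": "http", "host": host, "port": int(port)},
            "bypassList": ["localhost"],
        },
    }
    credentials = {"username": user, "password": password}
    # json.dumps emits valid JavaScript literals, including escaped quotes
    background_js = BACKGROUND_TEMPLATE % {
        "config": json.dumps(config),
        "credentials": json.dumps(credentials),
    }
    out_dir = out_dir or tempfile.mkdtemp()
    path = os.path.join(out_dir, "proxy_auth_plugin.zip")
    with zipfile.ZipFile(path, "w") as zp:
        zp.writestr("manifest.json", json.dumps(MANIFEST, indent=4))
        zp.writestr("background.js", background_js)
    return path
```

    The returned path can be handed to chrome_options.add_extension in place of the zip file written inside get_chromedriver.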