
Using multiple client certificates with Python and Selenium


I’m working on a web-scraping project using Python and Selenium with ChromeDriver. The pages require client certificates, and there are 2 scenarios the script must handle:

  1. Different certificates allow access to different URLs (e.g. Certificate A accesses URLs 1, 2 and 3, and Certificate B accesses URLs 4, 5 and 6)
  2. Multiple certificates can access the same URL (e.g. Certificates A and B can both access URLs 7, 8 and 9; those URLs return different company-specific data depending on which cert is used)

I’m on Windows/Windows Server and have used the AutoSelectCertificateForUrls registry policy, which auto-selects a certificate based on the URL (or a wildcard pattern). But for scenario #2 above it doesn’t help, because the same URL has to be opened with different certificates.
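
For context, each value under that policy is a JSON string that pairs a URL pattern with an optional certificate filter. A rough sketch of what the entries for scenario #1 might look like follows; the host names, the subject CNs, and the helper name `build_policy_entries` are placeholders chosen for illustration.

```python
import json

# Placeholder mapping: URL pattern -> subject CN of the certificate to present.
CERT_FOR_PATTERN = {
    "https://site-a.example.com": "Certificate A",
    "https://site-b.example.com": "Certificate B",
}


def build_policy_entries(cert_for_pattern):
    """Return {value name: JSON string}, one entry per URL pattern."""
    entries = {}
    for index, (pattern, subject_cn) in enumerate(cert_for_pattern.items(), start=1):
        # Registry value names under the policy key are "1", "2", ...
        entries[str(index)] = json.dumps(
            {"pattern": pattern, "filter": {"SUBJECT": {"CN": subject_cn}}}
        )
    return entries


for name, value in build_policy_entries(CERT_FOR_PATTERN).items():
    print(name, value)
```

Because the entries are keyed on the URL pattern, one fixed set of them cannot say "Certificate A this time, Certificate B next time" for the same URL, which is the gap in scenario #2.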

So ideally, I’d like to pass the URL and cert name to the Python script and have Chrome use that cert when accessing the specified URL, but I don’t see a way to do that. So far, I have:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait, Select

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--allow-insecure-localhost')
chrome_options.add_argument('--ignore-ssl-errors=yes')
chrome_options.add_argument('--ignore-certificate-errors')
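# Pass the options to the driver; otherwise the arguments above are never applied
driver = webdriver.Chrome(options=chrome_options)

driver.get(url)  # url is passed in to the script
# ...
# scrape code here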

Does anyone have good step-by-step instructions to handle this?


Solution

    import hashlib
    import sqlite3

    import win32crypt     # pywin32 (Windows only)
    import win32cryptcon  # the CERT_* constants live here, not in win32crypt
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.chrome.options import Options

    DATABASE_PATH = 'path/to/database.db'  # Database with URLs and cert thumbprints
    CHROMEDRIVER_PATH = 'path/to/chromedriver'

    def fetch_thumbprint_for_url(url):
        """Return the thumbprint stored for this URL, or None if there is none."""
        conn = sqlite3.connect(DATABASE_PATH)
        try:
            cursor = conn.cursor()
            cursor.execute("SELECT thumbprint FROM certs WHERE url = ?", (url,))
            result = cursor.fetchone()
        finally:
            conn.close()
        return result[0] if result else None

    def get_cert_from_store(thumbprint):
        """Find a certificate in the current user's "MY" store by SHA-1 thumbprint.

        Assumes the thumbprint is stored as a hex string (spaces are ignored).
        """
        wanted = thumbprint.replace(" ", "").lower()
        store = win32crypt.CertOpenStore(
            win32cryptcon.CERT_STORE_PROV_SYSTEM, 0, None,
            win32cryptcon.CERT_SYSTEM_STORE_CURRENT_USER, "MY")
        # The thumbprint is the SHA-1 hash of the certificate's DER encoding
        for cert in store.CertEnumCertificatesInStore():
            if hashlib.sha1(cert.CertEncoded).hexdigest() == wanted:
                return cert
        raise LookupError("Certificate not found.")

    def setup_driver():
        chrome_options = Options()
        # These switches concern server-side certificate errors;
        # none of them selects a client certificate.
        chrome_options.add_argument('--allow-insecure-localhost')
        chrome_options.add_argument('--ignore-ssl-errors=yes')
        chrome_options.add_argument('--ignore-certificate-errors')
        service = Service(CHROMEDRIVER_PATH)
        return webdriver.Chrome(service=service, options=chrome_options)

    def access_url_with_cert(url):
        thumbprint = fetch_thumbprint_for_url(url)
        if not thumbprint:
            raise LookupError("No thumbprint found for this URL.")
        # This only verifies that the certificate is installed. Chrome still
        # decides which client certificate to present (for example via the
        # AutoSelectCertificateForUrls policy); the lookup does not change that.
        get_cert_from_store(thumbprint)
        driver = setup_driver()
        driver.get(url)
        return driver

    if __name__ == "__main__":
        test_url = "https://example.com"  # Update with the actual URL
        driver = access_url_with_cert(test_url)
        driver.quit()
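
The script above checks that the certificate is installed, but nothing in it tells Chrome which certificate to present. One possible way to cover both scenarios, sketched here under assumptions, is to rewrite the per-user AutoSelectCertificateForUrls policy right before each browser launch, so the same URL can be paired with a different certificate on each run. The helper names, the example subject CN, and the choice of the HKEY_CURRENT_USER policy key are illustrative assumptions; whether Chrome honours a user-level policy on a given machine should be verified (for instance on the chrome://policy page).

```python
import json

# Policy location under HKEY_CURRENT_USER; the values are named "1", "2", ...
POLICY_KEY = r"Software\Policies\Google\Chrome\AutoSelectCertificateForUrls"


def build_policy_value(url_pattern, subject_cn):
    """Build one policy entry that selects a certificate by its subject CN."""
    return json.dumps(
        {"pattern": url_pattern, "filter": {"SUBJECT": {"CN": subject_cn}}}
    )


def set_auto_select_policy(url_pattern, subject_cn):
    """Write the entry as value "1" of the policy key (Windows only)."""
    import winreg  # imported lazily so the module still loads on other platforms

    value = build_policy_value(url_pattern, subject_cn)
    with winreg.CreateKey(winreg.HKEY_CURRENT_USER, POLICY_KEY) as key:
        winreg.SetValueEx(key, "1", 0, winreg.REG_SZ, value)
    return value


def open_with_cert(url, url_pattern, subject_cn):
    """Set the policy, then start a fresh Chrome so it reads the new value."""
    from selenium import webdriver  # imported lazily; requires selenium

    set_auto_select_policy(url_pattern, subject_cn)
    driver = webdriver.Chrome()
    driver.get(url)
    return driver
```

Under these assumptions, calling `open_with_cert("https://example.com/report", "https://example.com", "Company A")` and later the same call with `"Company B"` would open the same URL once per certificate. Each call starts a new browser because the policy is written before launch; call `driver.quit()` between runs.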