Tags: python, selenium-chromedriver, azure-databricks, selenium-firefoxdriver

ChromeDriver issue in Databricks


I am getting a WebDriverException when running the code below. Code:

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.options import Options

#driver = webdriver.Chrome()
#print(driver.service.executable_path)
chrome_options= Options()
chrome_options.add_argument("--headless")
chrome_options.binary_location = "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/chromedriver_binary/chromedriver"  # Replace with the actual path to your Chrome binary
driver = webdriver.Chrome(options=chrome_options)
print(driver.service.process.path)

Error:

WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally.
  (unknown error: DevToolsActivePort file doesn't exist)
  (The process started from chrome location /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/chromedriver_binary/chromedriver is no longer running, so ChromeDriver is assuming that Chrome has crashed.)

The libraries installed on the cluster were shown in a screenshot that is not reproduced here.

I have tried multiple things, including downloading Chrome and the driver manually, but still no luck. After that I tried Firefox, but that gives the issue below. Please suggest how to solve this problem.

Firefox code:

%sh 
wget https://github.com/mozilla/geckodriver/releases/download/v0.24.0/geckodriver-v0.24.0-linux64.tar.gz -O /tmp/geckodriver.tar.gz

%sh
tar -xvzf /tmp/geckodriver.tar.gz -C /tmp

%sh
ls /tmp/gec*

%sh 
/usr/bin/yes | sudo apt update --fix-missing > /dev/null 2>&1

%sh 
sudo apt-get --yes --force-yes install firefox > /dev/null 2>&1

from selenium import webdriver
from selenium.webdriver.firefox.options import Options

options =Options()
options.headless =True
driver=webdriver.Firefox(options=options, executable_path ='/tmp/geckodriver')

Error:

<command-1051731704975926>:6: DeprecationWarning: executable_path has been deprecated, please pass in a Service object
  driver=webdriver.Firefox(options=options, executable_path ='/tmp/geckodriver')
WebDriverException: Message: Service /tmp/geckodriver unexpectedly exited. Status code was: 1

Solution

  • You can follow this approach.

    Run the code below to set up the options.

    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")               # the cluster has no display
    options.add_argument("--no-sandbox")             # commonly needed when Chrome runs as root
    options.add_argument("--disable-dev-shm-usage")  # avoid relying on a small /dev/shm
    options.add_argument("--disable-gpu")
    # Path to the chromedriver executable (the driver, not the Chrome browser itself)
    binary_path = "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/chromedriver_binary/chromedriver"
    

    Next, run the following to install the stable version of Chrome.

    %sh
    # sudo has to apply to the commands that write (apt-key, tee), not to curl or echo
    curl -sS -o - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
    echo "deb https://dl.google.com/linux/chrome/deb/ stable main" | sudo tee -a /etc/apt/sources.list.d/google-chrome.list
    sudo apt-get -y update
    sudo apt-get -y install google-chrome-stable
    

    The installed Chrome version is 114.0.5735, which is the same as the version of the chromedriver binary installed on the cluster.
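The version comparison above can also be done in code rather than by eye. The following is a minimal sketch, not part of the original answer: the helper names `major_version`, `tool_version` and `versions_match` are hypothetical, and it assumes that both `google-chrome` and the chromedriver binary print a dotted version number when run with `--version`.

```python
import re
import shutil
import subprocess


def major_version(version_output):
    """Return the major number of the first dotted version in the text, or None."""
    match = re.search(r"(\d+)\.\d+", version_output)
    return int(match.group(1)) if match else None


def tool_version(executable):
    """Run `<executable> --version` and return its output, or None if it is not found."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip()


def versions_match(chrome_cmd, driver_cmd):
    """True only if both tools are found and report the same major version."""
    chrome = tool_version(chrome_cmd)
    driver = tool_version(driver_cmd)
    if chrome is None or driver is None:
        return False
    chrome_major = major_version(chrome)
    return chrome_major is not None and chrome_major == major_version(driver)
```

On the cluster this could be called as `versions_match("google-chrome", binary_path)`, with `binary_path` being the chromedriver path defined earlier. A `False` result would suggest reinstalling either Chrome or the driver before creating the browser object.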

    Then, get the browser object.

    browser = webdriver.Chrome(service=Service(binary_path), options=options)
    print(browser.service.process.args)
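The steps of this answer can be combined into one helper that also shuts the browser down afterwards. This is a sketch under assumptions rather than the answerer's code: `HEADLESS_FLAGS` and `fetch_title` are hypothetical names, and the Selenium imports are kept inside the function so the module can be loaded even before Selenium is installed on the cluster.

```python
# Flags taken from the options set up in the answer.
HEADLESS_FLAGS = [
    "--headless",
    "--no-sandbox",
    "--disable-dev-shm-usage",
    "--disable-gpu",
]


def fetch_title(url, driver_path):
    """Open url in headless Chrome and return the page title.

    driver_path is the chromedriver executable, not the Chrome browser.
    The browser is always closed, even if loading the page fails.
    """
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    for flag in HEADLESS_FLAGS:
        options.add_argument(flag)

    browser = webdriver.Chrome(service=Service(driver_path), options=options)
    try:
        browser.get(url)
        return browser.title
    finally:
        browser.quit()
```

Calling `quit()` in a `finally` block is a design choice here: on a long-running cluster, Chrome processes left over from failed runs would otherwise keep consuming memory.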
    

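For the Firefox attempt in the question, the deprecation warning asks for a `Service` object instead of `executable_path`. A minimal sketch of that call shape follows, assuming Selenium 4. The helper names `driver_is_executable` and `make_firefox` are hypothetical. This addresses only the warning; the "unexpectedly exited" error may have a separate cause, such as a geckodriver release that does not match the installed Firefox.

```python
import os

GECKODRIVER_PATH = "/tmp/geckodriver"  # path used in the question


def driver_is_executable(path):
    """True if path is an existing file that the current user may execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


def make_firefox(driver_path=GECKODRIVER_PATH):
    """Create a headless Firefox driver using a Service object."""
    if not driver_is_executable(driver_path):
        raise FileNotFoundError(
            f"geckodriver not found or not executable: {driver_path}"
        )

    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.firefox.service import Service

    options = Options()
    options.add_argument("-headless")  # command-line form of headless mode
    return webdriver.Firefox(service=Service(driver_path), options=options)
```

The executable check runs first so that a missing or non-executable driver produces a clear message instead of a generic service failure.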