I've recently switched computers and I'm trying to port over a script that scrapes a site, but it's not working. I'm running Mozilla Firefox 93.0, geckodriver 0.30.0 (d372710b98a6 2021-09-16 10:29 +0300), and Python 3.8.10 on Windows Subsystem for Linux.
geckodriver.log is the following:
1635724994219 geckodriver INFO Listening on 127.0.0.1:34993
1635724994224 mozrunner::runner INFO Running command: "/usr/bin/firefox" "--marionette" "--headless" "--remote-debugging-port" "45483" "-no-remote" "-profile" "/tmp/rust_mozprofileKcEU8P"
*** You are running in headless mode.
1635724994420 Marionette INFO Marionette enabled
[GFX1-]: RenderCompositorSWGL failed mapping default framebuffer, no dt
console.warn: SearchSettings: "get: No settings file exists, new profile?" (new NotFoundError("Could not open the file at /tmp/rust_mozprofileKcEU8P/search.json.mozlz4", (void 0)))
DevTools listening on ws://localhost:45483/devtools/browser/ef680f3f-d655-4d3d-86be-7287f5731e16
1635724995327 Marionette INFO Listening on port 38833
JavaScript error: resource://services-settings/Attachments.jsm, line 391: TypeError: / is not a valid URL.
1635724995445 RemoteAgent WARN TLS certificate errors will be ignored for this session
1635724995448 RemoteAgent INFO Proxy settings initialised: {"proxyType":"manual","httpProxy":"127.0.0.1:46485","sslProxy":"127.0.0.1:46485"}
1635724996122 Marionette WARN Ignoring event 'pageshow' because document has an invalid readyState of 'interactive'.
1635725002780 Marionette WARN Ignoring event 'pageshow' because document has an invalid readyState of 'interactive'.
[GFX1-]: Receive IPC close with reason=AbnormalShutdown
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Meanwhile, the program repeatedly throws the following error on the command line:
refresh_site() 482 https://www.xkcd.com/
Message: Reached error page: about:neterror?e=nssFailure2&u=https%3A//www.xkcd.com/&c=UTF-8&d=The%20connection%20to%20the%20server%20was%20reset%20while%20the%20page%20was%20loading.
Stacktrace:
WebDriverError@chrome://remote/content/shared/webdriver/Errors.jsm:181:5
UnknownError@chrome://remote/content/shared/webdriver/Errors.jsm:488:5
checkReadyState@chrome://remote/content/marionette/navigate.js:64:24
onNavigation@chrome://remote/content/marionette/navigate.js:312:39
emit@resource://gre/modules/EventEmitter.jsm:160:20
receiveMessage@chrome://remote/content/marionette/actors/MarionetteEventsParent.jsm:42:25
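The about:neterror URL in that message is percent-encoded, which makes it hard to read. As a small sketch (the helper name parse_neterror is made up for illustration, and it only assumes the message keeps the query-string layout shown above), the fields can be decoded with the standard library:

```python
from urllib.parse import parse_qs

# The error page URL copied from the message above.
NETERROR_URL = (
    "about:neterror?e=nssFailure2&u=https%3A//www.xkcd.com/&c=UTF-8"
    "&d=The%20connection%20to%20the%20server%20was%20reset%20while"
    "%20the%20page%20was%20loading."
)


def parse_neterror(url):
    """Return the query parameters of an about:neterror URL as a plain dict.

    Hypothetical helper: it splits on the first '?' and decodes the rest,
    keeping the first value of each parameter.
    """
    _, _, query = url.partition("?")
    return {key: values[0] for key, values in parse_qs(query).items()}


info = parse_neterror(NETERROR_URL)
print(info["e"])  # error code
print(info["u"])  # URL that failed to load
print(info["d"])  # human-readable description
```

Here the e field carries the error code (nssFailure2) and the d field the description, which is what hinted at a TLS-level problem rather than a plain geckodriver mismatch.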
I'm familiar with a fair number of geckodriver errors from previous experience and have usually been able to fix them by reinstalling Firefox and geckodriver with matching versions, but this is a new one for me and I don't know how to proceed. Thoughts?
Edit:
For the record, I can initialize the webdriver without error, but when I uncomment the line self.driver.get(self.user_site), an error is thrown every time.
Edit 2:
I suspect it has something to do with the arguments passed to Firefox, since on the computer where it works, the log shows that the command being sent is "/usr/bin/firefox" "--marionette" "--headless" "-foreground" "-no-remote" "-profile" "/tmp/rust_mozprofiledAb1T0"
which is different from what my new computer is doing, but I don't know enough Selenium to fix that off the top of my head.
Edit 3:
I think this is a security certificate issue. I ran the following as a Python script and it worked fine.
from selenium import webdriver
driver = webdriver.Firefox()
driver.get("https://dev.to")
driver.find_element_by_id("nav-search").send_keys("Selenium")
It still works when I swap in the URL that I actually care about, but I get security errors when trying to open it in non-headless mode with my production code.
Edit 4:
This code replicates the issue, and changing the seleniumwire import to selenium makes it go away:
from seleniumwire import webdriver


class Foo:
    def __init__(self):
        self.web_options = webdriver.FirefoxOptions()
        self.driver = webdriver.Firefox(options=self.web_options)

    def bar(self):
        # Fails under seleniumwire; works when importing webdriver from selenium
        self.driver.get("https://xkcd.com")
        print(self.driver.current_url)


Foo().bar()
The actual error that's stopping seleniumwire is AttributeError: module 'lib' has no attribute 'SSL_CTX_get0_param'. It is caused by the cryptography package (https://pypi.org/project/cryptography/#history) being installed as version 2.8, which is two years out of date. This seems like the ultimate answer, which I am working on now.
Turns out my cryptography package was out of date. I ran pip install pyopenssl, which fixed the issue.
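To check whether the same problem applies on another machine, a rough sketch like the following reports which versions are installed. It uses importlib.metadata, which is in the standard library from Python 3.8 onward. The helper names are made up for illustration, the list of packages is an assumption about what is relevant here, and the simple version parsing only handles plain numeric versions such as 2.8.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_minor(version):
    """Turn the leading 'X.Y' of a plain numeric version string into a tuple of ints."""
    return tuple(int(part) for part in version.split(".")[:2])


# Package names assumed relevant to this issue; adjust as needed.
for name in ("cryptography", "pyOpenSSL", "selenium-wire"):
    print(name, installed_version(name))

# The broken install reported cryptography 2.8, which compares as older than 3.0.
print(major_minor("2.8") < (3, 0))
```

If cryptography shows up as a very old release, upgrading it (or reinstalling pyopenssl, as above) is the first thing to try.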