
Python requests Library SSL error: [Errno 2] No such file or directory


First-ever question here. I'm getting the following traceback:

File "D:\Anaconda\Lib\site-packages\requests\api.py", line 70, in get
    return request('get', url, params=params, **kwargs)
File "D:\Anaconda\Lib\site-packages\requests\api.py", line 56, in request
    return session.request(method=method, url=url, **kwargs)
File "D:\Anaconda\Lib\site-packages\requests\sessions.py", line 475, in request
    resp = self.send(prep, **send_kwargs)
File "D:\Anaconda\Lib\site-packages\requests\sessions.py", line 596, in send
    r = adapter.send(request, **kwargs)
File "D:\Anaconda\Lib\site-packages\requests\adapters.py", line 497, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: [Errno 2] No such file or directory

This traces back to one line of code here:

import requests, os, bs4, calendar #, sys
import urllib.request

while not year>2016:
    print('Downloading page {}...'.format(url))

    res = requests.get(loginpageURL, verify='false', auth=('username', 'password')) #this is the line that doesn't work
    res = requests.get(url, verify='false') #but I have tried it without that line and this line also doesn't work
    res.raise_for_status()

    soup = bs4.BeautifulSoup(res.text)
    print(soup)

I have researched the issue extensively, and come to the conclusion that it is actually an issue with the requests/urllib3 libraries themselves.

At first I tried the verify='false' fix suggested here; it didn't work. Someone here said to install a newer OpenSSL and certifi, but both appear to be installed and up to date on my system. The bug itself has a great writeup here, though with no solution that I could see, and it has been identified as a known issue on GitHub here.

When, following this answer, I changed verify='false' to verify='cacert.pem' (a file I placed in the project directory), it threw a different error: requests.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:645)

Now I'm sitting here just wanting to get this one snippet to run - I'm trying to bulk-download a few hundred zip files from a website - in spite of the known issue with the library. I'm relatively new to Python, and especially new to web scraping, so this is a steep learning curve for me. Any help would be appreciated. Do I need to go so far as scrapping requests altogether?

Thanks!


Solution

  • res = requests.get(loginpageURL, verify='false', ...
    

    verify takes either a boolean (i.e. True or False) or a path, which is then used as the trust store. Your value 'false' is a string, not a boolean, so requests tries to use a file named false as the CA store. That file cannot be found, which results in No such file or directory.

    To fix this you have to use verify=False, i.e. use the boolean value.

    Apart from that, disabling validation is a bad idea and should only be done for testing, or when the security offered by TLS is completely irrelevant to the program. For a login page like yours, disabling validation is especially dangerous, because a man in the middle could then easily sniff the username and password.
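To see why the string 'false' produces exactly the error from the question, here is a minimal sketch of how the verify argument is interpreted. The resolve_verify helper is illustrative only (a simplification, not the real requests/urllib3 internals), but the string-versus-boolean distinction it shows is the actual cause:

```python
import os

def resolve_verify(verify):
    """Roughly mimic how requests interprets the `verify` argument.
    Illustrative helper only -- not the real requests internals."""
    if verify is False:
        return None                  # certificate checking disabled
    if verify is True:
        return "<bundled CA store>"  # requests would use certifi's bundle
    # Any other value -- including the STRING 'false' -- is treated
    # as a filesystem path to a CA bundle file or directory:
    if not os.path.exists(verify):
        raise IOError("[Errno 2] No such file or directory: %r" % verify)
    return verify

print(resolve_verify(False))     # None: validation switched off
try:
    resolve_verify('false')      # reproduces the question's error
except IOError as err:
    print(err)
```

With the boolean fixed, the better long-term fix is to keep verify=True (the default) or pass verify='path/to/cacert.pem' pointing at a bundle that actually contains the CA chain for the server; the CERTIFICATE_VERIFY_FAILED error from the question suggests the cacert.pem that was tried did not include it.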