ssl tornado

SSL error when running a Tornado spider through a proxy


I am running a spider written with Tornado, based on https://github.com/tornadoweb/tornado/blob/master/demos/webspider/webspider.py, except that I switched httpclient.AsyncHTTPClient to curl_httpclient.CurlAsyncHTTPClient with:

httpclient.AsyncHTTPClient.configure('tornado.curl_httpclient.CurlAsyncHTTPClient')

The spider runs on Windows 10 with 64-bit Python 3.

Unfortunately, it fails with this error:

tornado.curl_httpclient.CurlError: HTTP 599: SSL certificate problem: unable to get local issuer certificate

Has anyone seen this? I searched Google, but there are not many examples of Tornado spiders, and I could not find an answer.

Or can anyone tell me what this error means?


Solution

  • Try overriding the _curl_create method in curl_httpclient.CurlAsyncHTTPClient:

    import logging

    import certifi
    import pycurl
    from tornado import curl_httpclient, httpclient

    curl_log = logging.getLogger('tornado.curl_httpclient')

    class PersonAsyncHTTPClient(curl_httpclient.CurlAsyncHTTPClient):
        def _curl_create(self):
            curl = pycurl.Curl()

            # Not in Tornado's original method: point libcurl at certifi's CA bundle.
            # Without this line the SSL error above occurs.
            curl.setopt(pycurl.CAINFO, certifi.where())

            if curl_log.isEnabledFor(logging.DEBUG):
                curl.setopt(pycurl.VERBOSE, 1)
                curl.setopt(pycurl.DEBUGFUNCTION, self._curl_debug)
            # PROTOCOLS first appeared in pycurl 7.19.5 (2014-07-12)
            if hasattr(pycurl, 'PROTOCOLS'):
                curl.setopt(pycurl.PROTOCOLS, pycurl.PROTO_HTTP | pycurl.PROTO_HTTPS)
                curl.setopt(pycurl.REDIR_PROTOCOLS, pycurl.PROTO_HTTP | pycurl.PROTO_HTTPS)
            return curl
    

    Then use the subclass like this:

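    http_client = PersonAsyncHTTPClient()
    req = httpclient.HTTPRequest(url='https://www.google.com.hk/', proxy_host='', proxy_port=1234)
    # Inside a coroutine, send the request through the subclassed client:
    # response = await http_client.fetch(req)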
    

    With that change, the request succeeds.

    That's all.
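
A possible alternative that avoids overriding a private method: tornado.httpclient.HTTPRequest accepts a ca_certs argument, which the curl client passes to libcurl as CAINFO. The sketch below assumes that approach works in your setup. The helper names find_ca_bundle, build_request_kwargs and fetch are hypothetical (they are not part of Tornado); only url, ca_certs, proxy_host and proxy_port are real HTTPRequest parameters. Third-party imports are done lazily, so the helpers can be tried without Tornado or pycurl installed.

```python
import importlib.util


def find_ca_bundle():
    """Return the path of certifi's CA bundle, or None if certifi is missing."""
    if importlib.util.find_spec("certifi") is None:
        return None
    import certifi  # third-party; imported lazily
    return certifi.where()


def build_request_kwargs(url, proxy_host=None, proxy_port=None, ca_bundle=None):
    """Collect keyword arguments for tornado.httpclient.HTTPRequest.

    Hypothetical helper: it only assembles a dict and sends nothing.
    """
    kwargs = {"url": url}
    # Prefer an explicitly given bundle, otherwise fall back to certifi.
    bundle = ca_bundle or find_ca_bundle()
    if bundle:
        kwargs["ca_certs"] = bundle
    # Add the proxy settings only when both host and port are given.
    if proxy_host and proxy_port:
        kwargs["proxy_host"] = proxy_host
        kwargs["proxy_port"] = proxy_port
    return kwargs


async def fetch(url, **options):
    """Fetch url with the curl-based client (needs tornado and pycurl)."""
    from tornado import httpclient

    httpclient.AsyncHTTPClient.configure(
        "tornado.curl_httpclient.CurlAsyncHTTPClient"
    )
    request = httpclient.HTTPRequest(**build_request_kwargs(url, **options))
    return await httpclient.AsyncHTTPClient().fetch(request)
```

To try it, run something like asyncio.run(fetch('https://www.google.com.hk/', proxy_host=..., proxy_port=1234)) with your own proxy host. Because curl handles are reused between requests, it is probably safest to pass ca_certs on every request rather than only on some of them.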