Python: KeyError/IOError with urllib.urlopen


I am trying to pass some text to this readability API like so:

    text = 'this reminds me of the Dutch 2001a caravan full of smoky people Auld Lang Syne'
    # construct Readability Metrics API url
    request_url = 'http://ipeirotis.appspot.com/readability/GetReadabilityScores?format=json&text=%s' % text
    request_url = urllib.quote_plus(request_url.encode('utf-8'))
    # make request
    j = json.load(urllib.urlopen(request_url))

I get this error on the last line though:

[Errno 2] No such file or directory: 'http://ipeirotis.appspot.com/readability/GetReadabilityScores?format=json&text=this+reminds+me+of+the+Dutch+2001a+caravan+full+of+smoky+people+Auld+Lang+Syne'

However, the URL in the error is valid and returns a response when you visit it. How do I encode the URL so that I can use urlopen? Thanks a lot.


Solution

  • You are quoting the full URL, including the http:// scheme and the ?, = and & delimiters. If you print the actual value of request_url, you get

    >>> print request_url
    http%3A%2F%2Fipeirotis.appspot.com%2Freadability%2FGetReadabilityScores%3Fformat%3Djson%26text%3Dthis+reminds+me+of+the+Dutch+2001a+caravan+full+of+smoky+people+Auld+Lang+Syne

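    This is not what you want. The string no longer begins with http://, and in Python 2 urllib.urlopen treats a string without a scheme as a local file path, which is why you get "No such file or directory". Quote only the parts that should reach the website as a single parameter value. I tried the following and it seemed to work: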

    import json
    import urllib

    text = 'this reminds me of the Dutch 2001a caravan full of smoky people Auld Lang Syne'
    # construct Readability Metrics API url, quoting only the text parameter
    request_url = 'http://ipeirotis.appspot.com/readability/GetReadabilityScores?format=json&text=%s' % urllib.quote_plus(text.encode('utf-8'))
    # make request
    j = json.load(urllib.urlopen(request_url))
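
For anyone on Python 3, where these functions moved to urllib.parse and urllib.request, the same idea can be sketched as below. This is only an illustration, not part of the original answer: build_request_url and fetch_scores are hypothetical helper names, and the endpoint is copied from the question without checking that it is still online. urlencode escapes each parameter value and leaves the scheme and path alone, so the URL cannot be over-quoted by accident.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the question; its availability is an assumption.
BASE_URL = 'http://ipeirotis.appspot.com/readability/GetReadabilityScores'


def build_request_url(text):
    """Return BASE_URL plus an encoded query string (hypothetical helper)."""
    # urlencode applies quote_plus rules to each value by default and
    # joins the key=value pairs with '&'; the scheme and path stay as-is.
    query = urlencode({'format': 'json', 'text': text})
    return BASE_URL + '?' + query


def fetch_scores(text):
    """Request the scores and parse the JSON body (needs network access)."""
    response = urlopen(build_request_url(text))
    try:
        return json.loads(response.read().decode('utf-8'))
    finally:
        response.close()
```

Calling build_request_url(text) with the sentence from the question should give a URL that still starts with http://, with only the text value escaped; fetch_scores(text) would then perform the request itself.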