python-2.7, web-scraping, scrapy, scrapyd

Check for a 500 error to bypass it


I use the Scrapy framework to crawl data. My crawler is interrupted when it encounters a 500 error, so I need to check whether a link is available before I parse the page content.
Is there any approach to resolve my problem?
Thank you so much.


Solution

  • You could use the getcode() method of the file-like object returned by urllib.urlopen() to check the HTTP status code:

    import urllib
    import sys
    
    # urllib.urlopen (Python 2) does not raise on HTTP error statuses,
    # so the status code can be inspected via getcode().
    webFile = urllib.urlopen('http://www.some.url/some/file')
    returnCode = webFile.getcode()
    
    if returnCode == 500:
        sys.exit()
    
    # otherwise, continue processing the page
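Inside a Scrapy callback the same check can be kept as a small helper so the spider skips a bad page instead of aborting the whole crawl. This is a minimal sketch; the helper name `should_skip` and the choice to treat every 5xx server error (not only 500) as skippable are assumptions, not part of the original answer:

```python
def should_skip(status_code):
    """Return True when a response should be skipped before parsing.

    Treats any 5xx server error as a skip signal, since these
    indicate a server-side failure rather than a missing page.
    """
    return 500 <= status_code < 600
```

In a Scrapy `parse` callback, `response.status` carries the HTTP status code, so `if should_skip(response.status): return` drops the page and lets the crawl continue (note that Scrapy's HttpErrorMiddleware filters non-2xx responses by default unless 500 is listed in the spider's `handle_httpstatus_list`).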