I am using Scrapy and would like to check my database for a `should_continue` flag and raise a `CloseSpider` exception if it is false. However, according to the documentation here: http://doc.scrapy.org/en/latest/topics/exceptions.html, `CloseSpider` can only be raised from `parse` or `parse_item`.
I could add a check to each spider's `parse` and `parse_item`, but that goes against the DRY principle. Can I somehow create middleware that is always called before `parse` and `parse_item` run?
I couldn't get it to trigger using `DOWNLOADER_MIDDLEWARES` or `SPIDER_MIDDLEWARES`. What's the correct way to do this?
The only thing Scrapy does when `CloseSpider` is raised is call the `close_spider()` method of the execution engine: https://github.com/scrapy/scrapy/blob/master/scrapy/core/scraper.py#L152-L153
You can just call that method yourself to achieve the same result.
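For example, a spider middleware can run the check once, before every callback, which keeps it out of each spider's `parse` methods. This is a minimal sketch; the middleware name and `should_continue()` are placeholders for your own database lookup:

```python
class ShouldContinueMiddleware:
    """Spider middleware sketch: before any callback (parse, parse_item, ...)
    runs for a response, check a flag and ask the engine to close the
    spider if it is False. `should_continue()` is a stand-in for the real
    database check."""

    def process_spider_input(self, response, spider):
        # Called for every response on its way to the spider callback,
        # so the check lives in one place instead of in every parse method.
        if not spider.should_continue():
            # Equivalent to raising CloseSpider from inside a callback:
            # ask the execution engine to shut the spider down.
            spider.crawler.engine.close_spider(
                spider, reason='should_continue flag is False')
        # Returning None lets processing continue normally.
        return None
```

Enable it under `SPIDER_MIDDLEWARES` in your settings; `spider.crawler` is available on spiders created through `from_crawler`, which is the default.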
This is also what the CloseSpider extension does: https://github.com/scrapy/scrapy/blob/master/scrapy/extensions/closespider.py
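Following that pattern, you could also write your own extension that connects the check to a signal. A sketch, assuming that checking once per response is frequent enough; all names are placeholders, and the signal hookup is shown as a comment:

```python
class StopFlagExtension:
    """Extension sketch, modeled on scrapy's built-in CloseSpider
    extension. should_continue() stands in for the real database query."""

    def __init__(self, crawler):
        self.crawler = crawler

    @classmethod
    def from_crawler(cls, crawler):
        ext = cls(crawler)
        # Hook a signal so the check runs automatically. In a real project:
        #   from scrapy import signals
        #   crawler.signals.connect(ext.response_received,
        #                           signal=signals.response_received)
        return ext

    def should_continue(self):
        # Placeholder for the real database lookup.
        return True

    def response_received(self, response, request, spider):
        # Runs for every response once connected to signals.response_received.
        if not self.should_continue():
            self.crawler.engine.close_spider(
                spider, reason='should_continue flag is False')
```

Depending on how often you need the check, a different signal (e.g. `spider_idle`) may be a better fit.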