Is there a way to get the instance of the spider that runs when you schedule a run using scrapyd? I need to access attributes in the spider to handle outside the run and can't use a json/csv file to do this.
I found what I needed here: connecting a handler to the spider_closed signal lets me run the code I need right before the spider closes.
You need to add the following to the pipeline's __init__ method (otherwise it never receives the spider_closed signal):
from scrapy.xlib.pydispatch import dispatcher  # dispatcher bundled with older Scrapy versions
from scrapy import signals

dispatcher.connect(self.spider_opened, signals.spider_opened)
dispatcher.connect(self.spider_closed, signals.spider_closed)
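For context, here is a minimal sketch of a complete pipeline wired up this way. The MyStatsPipeline class name and the items_scraped attribute are placeholders for whatever spider attributes you actually need to read:

from scrapy import signals
from scrapy.xlib.pydispatch import dispatcher  # dispatcher bundled with older Scrapy versions


class MyStatsPipeline(object):
    """Hypothetical pipeline that reads spider attributes right before close."""

    def __init__(self):
        # Without these connections the pipeline never receives the signals.
        dispatcher.connect(self.spider_opened, signals.spider_opened)
        dispatcher.connect(self.spider_closed, signals.spider_closed)

    def spider_opened(self, spider):
        spider.items_scraped = 0  # hypothetical attribute to track

    def process_item(self, item, spider):
        spider.items_scraped += 1
        return item

    def spider_closed(self, spider):
        # The live spider instance is available here, so its attributes can
        # be read and handed off (to a database, an API, etc.) before shutdown.
        print("spider %s scraped %d items" % (spider.name, spider.items_scraped))

Note that in newer Scrapy versions the recommended way to do the same wiring is a from_crawler classmethod on the pipeline that calls crawler.signals.connect, since the bundled scrapy.xlib.pydispatch module was eventually removed.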