python, scrapy, recaptcha

Using meta attribute inside a downloader middleware


According to the Scrapy docs, response.request cannot be used in a downloader middleware, because the request object is attached to the response only after it has passed through all other downloader middlewares. However, I have noticed that after a redirect (to a captcha page), responses inside a downloader middleware have not only an empty request field but also an empty meta (the PyCharm debugger tells me the response is not tied to any request). How can I force Scrapy to keep the meta while processing inside a downloader middleware? I have added meta=response.meta to every request, but I still get errors about missing meta keys, and about a missing meta attribute as well.

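    # Spider method; assumes `from scrapy import Request` at module level.
    def start_requests(self):
        for value in values:
            # Attach a per-request value that should survive the download.
            yield Request(
                self.SEARCH_URL,
                meta={'ssomekey': value},
            )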

From the downloader middleware:

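    def process_response(self, request, response, spider):
        # The response has no request attached yet, so this check fails.
        if not hasattr(response, 'meta'):
            print("there is no meta")
        # process_response must return a Response (or a Request).
        return response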

After launching, the spider immediately prints "there is no meta".


Solution

  • The request object is passed as an argument to the process_response method of downloader middlewares, not just to process_request. As @paul-trmrth suggests, use request.meta instead of response.meta or response.request.meta. The meta dict is propagated through all middlewares on both sides of the download and on to the spider.

    https://docs.scrapy.org/en/latest/topics/downloader-middleware.html#scrapy.downloadermiddlewares.DownloaderMiddleware.process_response

    Sorry to necro, but I had the same question and found an answer.
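
A minimal sketch of the approach described in the answer, in case it helps. The class name MetaAwareMiddleware is hypothetical, and the fake request and response objects are stand-ins so the snippet runs without Scrapy installed. In a real project, Scrapy would pass its own Request and Response objects to process_response.

```python
from types import SimpleNamespace


class MetaAwareMiddleware:
    """Hypothetical downloader middleware that reads meta from the request argument."""

    def process_response(self, request, response, spider):
        # response.request is not attached yet at this stage, so read the
        # meta dict from the request that is passed in as an argument.
        value = request.meta.get('ssomekey')
        if value is None:
            print("there is no meta")
        else:
            print("meta value: %s" % value)
        # A downloader middleware must return a Response (or a Request).
        return response


# Stand-in objects (assumptions for illustration, not Scrapy classes).
fake_request = SimpleNamespace(meta={'ssomekey': 'abc'})
fake_response = SimpleNamespace(status=200)
result = MetaAwareMiddleware().process_response(fake_request, fake_response, spider=None)
```

Using request.meta.get() rather than indexing avoids a KeyError for requests that were created without the key, which may be the case for requests generated elsewhere in the crawl.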