Tags: python, python-2.7, scrapy, stack-overflow

Scrapy: stack overflow from too many Requests


I have the following while loop to scrape pages

def after_login(self, response):
    i = 100000
    while i < 2000000:
        yield scrapy.Request("https://www.example.com/foobar.php?nr=" + str(i),
                             callback=self.another_login)
        i += 1

The problem is that the process gets killed because of a stack overflow. Is there a way to tell the while loop to queue 1000 requests and, when those are done, queue another 1000?


Solution

  • You should play around with the Scrapy settings: for instance, try decreasing CONCURRENT_REQUESTS or adding a DOWNLOAD_DELAY (a settings sketch follows below).

    If that does not help, look into debugging the memory usage; see also the Scrapy documentation on debugging memory leaks.
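
    A minimal sketch of what those settings tweaks could look like in the project's settings.py; the values below are illustrative assumptions, not tuned recommendations:

        # settings.py -- illustrative values, tune them for the target site
        CONCURRENT_REQUESTS = 8   # Scrapy's default is 16; fewer requests in flight at once
        DOWNLOAD_DELAY = 0.5      # seconds to wait between consecutive requests to the same site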