Tags: python, scrapy, web-crawler, robots.txt, scrapy-shell

How to disable robots.txt when you launch scrapy shell?


I use the Scrapy shell without problems on several websites, but I run into trouble when a site's robots.txt disallows access. How can I make Scrapy ignore robots.txt entirely? Thanks in advance. Note that I'm not asking about a Scrapy project, but about the standalone shell command: scrapy shell 'www.example.com'


Solution

  • If you are inside a Scrapy project, open the project's settings.py file, look for ROBOTSTXT_OBEY, and set it to False.
  • If you are running scrapy shell outside a project (as in your example), there is no settings.py to edit; instead, override the setting on the command line with the -s option: -s ROBOTSTXT_OBEY=False.
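
A quick sketch of both approaches (the URL is just a placeholder; substitute the site you are testing):

```shell
# Outside a project: override the setting for this shell session only.
scrapy shell -s ROBOTSTXT_OBEY=False 'https://www.example.com'

# Inside a project: make the change permanent in settings.py.
# The shell picks up the project settings automatically when run
# from the project directory.
echo 'ROBOTSTXT_OBEY = False' >> myproject/settings.py   # myproject is a placeholder name
scrapy shell 'https://www.example.com'
```

The -s flag works for any Scrapy setting and only affects the current invocation, which is usually what you want for ad-hoc debugging in the shell.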