
Running Scrapy in a docker container


I am setting up a new application which I would like to package with docker-compose. Currently, one container holds a Flask-Admin application that also exposes an API for interacting with the database. I will then have a number of scrapers that need to run once a day. These scrapers should scrape the data, reformat it, and then send it to the API. I expect the scrapers should live in a second docker container.
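For context, a rough sketch of the compose layout described above (the service names, build paths, and `API_URL` variable are placeholders, not part of an existing setup):

```yaml
# docker-compose.yml sketch
services:
  web:
    build: ./admin            # Flask-Admin app that also exposes the API
    ports:
      - "5000:5000"
  scrapers:
    build: ./scrapers         # container that runs the spiders on a schedule
    depends_on:
      - web
    environment:
      - API_URL=http://web:5000/api   # hypothetical endpoint the spiders post to
```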

Currently, on my local machine, I run `scrapy runspider myspider.py` to run each spider.

What would be the best way to have multiple scrapers in one container and have them scheduled to run at various points during the day?


Solution

  • You could configure the docker container that holds the scrapers to use cron to fire off the spiders at the appropriate times. Here's an example: "Run a cron job with Docker". A sketch of that setup follows below.
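A minimal sketch of the cron approach, assuming the spiders live in `/app` and a `requirements.txt` that installs Scrapy (all file names here are placeholders):

```dockerfile
FROM python:3.11-slim

# cron is not part of the slim base image
RUN apt-get update && apt-get install -y --no-install-recommends cron \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Install the schedule; files in /etc/cron.d must be root-owned,
# mode 0644, and end with a newline
COPY scrapers-cron /etc/cron.d/scrapers
RUN chmod 0644 /etc/cron.d/scrapers

# Run cron in the foreground so it keeps the container alive
CMD ["cron", "-f"]
```

The accompanying `scrapers-cron` file, one line per spider:

```
# /etc/cron.d format: minute hour day month weekday user command
# cron's default PATH omits /usr/local/bin, where pip installs scrapy
PATH=/usr/local/bin:/usr/bin:/bin

0 6 * * * root cd /app && scrapy runspider myspider.py >> /proc/1/fd/1 2>&1
0 7 * * * root cd /app && scrapy runspider otherspider.py >> /proc/1/fd/1 2>&1
```

Giving each spider its own line lets you stagger the runs across the day, and redirecting to `/proc/1/fd/1` sends the spiders' output to the container's main process so it shows up in `docker logs`.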