
How to schedule a Python script on Heroku


I am deploying a Python script (a Scrapy spider) on Heroku, and I want it to be launched 4 times in the morning.

I can run it without problems by connecting to my Heroku account (I have a free plan) and typing this on the Windows command line:

heroku run scrapy crawl sytadin

But I am having trouble running it through Heroku Scheduler. It suggests writing something like $ rake. I have never used rake: does it go before run or after run? Should the command start with the heroku keyword?

I have no idea, and everything I tried failed, as the log shows:

2017-01-19T23:47:05.305039+00:00 heroku[scheduler.3450]: Starting process with command `python "sytadin" crawl`
2017-01-19T23:47:05.974030+00:00 heroku[scheduler.3450]: State changed from starting to up
2017-01-19T23:47:08.335845+00:00 heroku[scheduler.3450]: State changed from up to complete
2017-01-19T23:47:08.204289+00:00 app[scheduler.3450]: /app/.heroku/python/bin/python: can't find '__main__' module in 'sytadin'
2017-01-19T23:47:08.326081+00:00 heroku[scheduler.3450]: Process exited with status 1
2017-01-19T23:48:27.681890+00:00 app[api]: Starting process with command `python sytadin/sytadin.py crawl` by user scheduler@addons.heroku.com
2017-01-19T23:48:35.571615+00:00 heroku[scheduler.6352]: Starting process with command `python sytadin/sytadin.py crawl`
2017-01-19T23:48:36.156250+00:00 heroku[scheduler.6352]: State changed from starting to up
2017-01-19T23:48:37.424920+00:00 heroku[scheduler.6352]: Process exited with status 2
2017-01-19T23:48:37.360306+00:00 app[scheduler.6352]: python: can't open file 'sytadin/sytadin.py': [Errno 2] No such file or directory
2017-01-19T23:48:37.445476+00:00 heroku[scheduler.6352]: State changed from up to complete

As you can see, I tried different possibilities I found on the web, but none of them worked :(

Any ideas for my Python script? :)


Solution

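  • Heroku Scheduler runs whatever command you type there in a one-off dyno, much as heroku run does. You do not prefix the command with heroku or run, and the $ rake text appears to be only an example command (rake is a Ruby tool, so it does not apply here).

    So, in your case, since your Scrapy crawler runs successfully with heroku run scrapy crawl sytadin, you can create a scheduler job that runs:

    scrapy crawl sytadin

    Your earlier attempts failed because they asked python to execute a file or directory named sytadin, whereas scrapy is its own command and sytadin is only the spider name passed to it.

    And that will do the trick =)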
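
For the "4 times in the morning" part, one possible approach is a small wrapper that the scheduler calls every hour and that only starts the crawl during chosen hours. The sketch below assumes the job uses Heroku Scheduler's hourly frequency and that hours are given in UTC; the file name run_crawl.py, the helper names, and the example hours are hypothetical and not from the original question.

```python
"""Hypothetical run_crawl.py: start the spider only during chosen UTC hours."""
import subprocess
import sys
from datetime import datetime, timezone

# The command from the question; everything else in this file is illustrative.
CRAWL_COMMAND = ["scrapy", "crawl", "sytadin"]


def is_due(now, hours):
    """Return True when the hour of `now` is one of the allowed hours."""
    return now.hour in hours


def main(hour_args, now=None, runner=subprocess.call):
    """Run the crawl if the current UTC hour is listed in hour_args."""
    hours = {int(arg) for arg in hour_args}
    if now is None:
        now = datetime.now(timezone.utc)
    if not is_due(now, hours):
        print("Hour %d (UTC) is not scheduled, skipping crawl." % now.hour)
        return 0
    # subprocess.call returns the exit status of the scrapy command.
    return runner(CRAWL_COMMAND)


# Only act when hours are passed on the command line,
# e.g. `python run_crawl.py 5 6 7 8`.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

With this file at the repository root, the scheduler command would be python run_crawl.py 5 6 7 8 (the hours here are only an example). Creating four separate daily jobs that each run scrapy crawl sytadin is a simpler alternative if no wrapper is wanted.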