python heroku port twisted scrapyd

How to bind Heroku port to scrapyd


I created a simple Python app on Heroku to launch scrapyd. The scrapyd service starts, but it listens on port 6800. Heroku requires the web process to bind to the port given in the $PORT environment variable, and I was able to run the Heroku app locally. The logs from the process are included below. I looked at the scrapy-heroku package, but could not install it because of errors. The code in that package's app.py seems to provide some clues as to how this can be done. How can I implement this as a Python command that starts scrapyd on the port provided by Heroku?

Procfile:

web: scrapyd

Heroku Logs:

2022-01-24T05:17:27.058721+00:00 app[web.1]: 2022-01-24T05:17:27+0000 [twisted.scripts._twistd_unix.UnixAppLogger#info] twistd 21.7.0 (/app/.heroku/python/bin/python 3.10.2) starting up.
2022-01-24T05:17:27.058786+00:00 app[web.1]: 2022-01-24T05:17:27+0000 [twisted.scripts._twistd_unix.UnixAppLogger#info] reactor class: twisted.internet.epollreactor.EPollReactor.
2022-01-24T05:17:27.059190+00:00 app[web.1]: 2022-01-24T05:17:27+0000 [-] Site starting on 6800
2022-01-24T05:17:27.059301+00:00 app[web.1]: 2022-01-24T05:17:27+0000 [twisted.web.server.Site#info] Starting factory <twisted.web.server.Site object at 0x7f1706e3eaa0>
2022-01-24T05:17:27.059649+00:00 app[web.1]: 2022-01-24T05:17:27+0000 [Launcher] Scrapyd 1.3.0 started: max_proc=32, runner='scrapyd.runner'
2022-01-24T05:18:25.204305+00:00 heroku[web.1]: Error R10 (Boot timeout) -> Web process failed to bind to $PORT within 60 seconds of launch
2022-01-24T05:18:25.231596+00:00 heroku[web.1]: Stopping process with SIGKILL
2022-01-24T05:18:25.402503+00:00 heroku[web.1]: Process exited with status 137

Solution

  • Read the PORT environment variable and write it into your scrapyd config file as http_port before scrapyd starts. The following script does this:

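    # init.py
    import os

    # Heroku passes the port the web process must bind to in $PORT.
    port = os.environ['PORT']

    # Append the setting to the existing scrapyd.conf. This assumes the file
    # ends inside its [scrapyd] section and does not already set http_port.
    with open('scrapyd.conf', 'a', encoding='utf-8') as f:
        f.write('\nhttp_port = %s\n' % port)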
    

    Source: https://github.com/scrapy/scrapyd/issues/367#issuecomment-591446036
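
    Appending to the file only works if scrapyd.conf exists, ends inside its [scrapyd] section, and has no http_port entry yet. A slightly more defensive variant, offered as a sketch rather than the linked author's method, parses the file and sets the option in place. The function name set_scrapyd_port and the bind_address value are assumptions: scrapyd's documented default bind_address is 127.0.0.1, and it is assumed here that the dyno must listen on all interfaces to be reachable. Note that rewriting the file through configparser drops any comments it contained.

```python
# init.py (alternative sketch)
import configparser
import os


def set_scrapyd_port(port, conf_path="scrapyd.conf"):
    """Set http_port in the [scrapyd] section of conf_path (hypothetical helper)."""
    # interpolation=None keeps any '%' characters in existing values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(conf_path, encoding="utf-8")  # a missing file is silently skipped

    if not parser.has_section("scrapyd"):
        parser.add_section("scrapyd")

    # Overwrites an existing http_port instead of adding a duplicate line.
    parser.set("scrapyd", "http_port", str(port))
    # Assumption: listen on all interfaces so the platform can reach the process.
    parser.set("scrapyd", "bind_address", "0.0.0.0")

    with open(conf_path, "w", encoding="utf-8") as f:
        parser.write(f)


if __name__ == "__main__":
    # Only touch the config when the platform supplies a port.
    port = os.environ.get("PORT")
    if port:
        set_scrapyd_port(port)
```

    With either version, the script has to run before scrapyd in the same dyno. One way to do that, assuming the script is saved as init.py next to scrapyd.conf, is a Procfile entry such as "web: python init.py && scrapyd".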