Search code examples
pythonherokunotificationsworker

Ironworker job done notification


I'm writing python app which currently is being hosted on Heroku. It is in early development stage, so I'm using free account with one web dyno. Still, I want my heavier tasks to be done asynchronously so I'm using iron worker add-on. I have it all set up and it does the simplest jobs like sending emails or anything that doesn't require any data being sent back to the application. The question is: How do I send the worker output back to my application from the iron worker? Or even better, how do I notify my app that the worker is done with the job?

I looked at other iron solutions like cache and message queue, but the only thing I can find is that I can explicitly ask for the worker state. Obviously I don't want my web service to poll the worker because it kind of defeats the original purpose of moving the tasks to background. What am I missing here?


Solution

  • I see this question is high in Google so in case you came here with hopes to find some more details, here is what I ended up doing:

    First, I prepared the endpoint on my app. My app uses Flask, so this is how the code looks:

    @app.route("/worker", methods=["GET", "POST"])
    def worker():
    #refresh the interface or whatever is necessary
        if flask.request.method == 'POST':
            return 'Worker endpoint reached'
        elif flask.request.method == 'GET':
            worker = IronWorker()
            task = worker.queue(code_name="hello", payload={"WORKER_DB_URL": app.config['WORKER_DB_URL'],
                                "WORKER_CALLBACK_URL": app.config['WORKER_CALLBACK_URL']})
            details = worker.task(task)
            flask.flash("Work queued, response: ", details.status)
            return flask.redirect('/')
    

    Note that in my case, GET is here only for testing, I don't want my users to hit this endpoint and invoke the task. But I can imagine situations when this is actually useful, specifically if you don't use any type of scheduler for your tasks.

    With the endpoint ready, I started to look for a way of visiting that endpoint from the worker. I found this fantastic requests library and used it in my worker:

    import sys, json
    from sqlalchemy import *
    import requests
    
    print "hello_worker initialized, connecting to database..."
    
    payload = None
    payload_file = None
    for i in range(len(sys.argv)):
        if sys.argv[i] == "-payload" and (i + 1) < len(sys.argv):
            payload_file = sys.argv[i + 1]
            break
    
    f = open(payload_file, "r")
    contents = f.read()
    f.close()
    
    payload = json.loads(contents)
    
    print "contents: ", contents
    print "payload as json: ", payload
    
    db_url = payload['WORKER_DB_URL']
    
    print "connecting to database ", db_url
    
    db = create_engine(db_url)
    metadata = MetaData(db)
    
    print "connection to the database established"
    
    users = Table('users', metadata, autoload=True)
    s = users.select()
    
    #def run(stmt):
    #    rs = stmt.execute()
    #    for row in rs:
    #        print row
    
    #run(s)
    
    callback_url = payload['WORKER_CALLBACK_URL']
    print "task finished, sending post to ", callback_url
    r = requests.post(callback_url)
    print r.text
    

    So in the end there is no real magic here, the only important thing is to send the callback url in the payload if you need to notify your page when the task is done. Alternatively you can place the endpoint url in the database if you use one in your app. Btw. the snipped above also shows how to connect to the postgresql database in your worker and print all the users.

    One last thing you need to be aware of is how to format your .worker file, mine looks like this:

    # set the runtime language. Python workers use "python"
    runtime "python"
    # exec is the file that will be executed:
    exec "hello_worker.py"
    # dependencies
    pip "SQLAlchemy"
    pip "requests"
    

    This will install the latest versions of SQLAlchemy and requests, if your project is dependent on any specific version of the library, you should do this instead:

    pip "SQLAlchemy", "0.9.1"