I have a fleet of multiple worker hosts polling for the following tasks of my SWF:
The file generated in Step-1 needs to be used again in Step-3, and then eventually discarded at the end of the workflow.
The system would work fine if there is only 1 host polling for all tasks. However, when I have multiple workers, I cannot seem to ensure that task-1 and task-3 would end up on the same host.
I would like to avoid doing the following:
I have the following questions:
Yes, absolutely. The basic idea is that SWF task lists (queues used to deliver activity tasks) are dynamic. So each host can have its own task list and workflow can specify specific task list name when calling an activity. See fileprocessing sample which executes download activity on any host from the pool, then converts the file and uploads the result on the same host as the first one.
The approach of caching result in the worker process memory or on the local disk is considered the best practice. Sometimes using external data store and getting it each times also makes sense.