How to design a stateless worker that processes each message only once using Uber Cadence


please help us to adopt Cadence :D

Here is the current design. Stateless workers pull messages from a centralized queue and process them. Each worker contains complex business logic as well as a deduper, which uses a separate Redis cluster as a remote distributed cache (strongly consistent via consensus). The cache stores only message IDs and their statuses: "not started", "in progress", or "completed". A worker is expected to process only messages that are not yet completed.

Personally, I would like to rethink all possible solutions. A workflow model comes to mind because I have had a pleasant experience with AWS SWF. Since all our services are written in Go and run in our own data center, I would like to try Uber Cadence (an open-source counterpart of SWF).

I watched many videos from Uber users, and I think the first step is to start with a single activity in a new workflow and then break it down into multiple activities, or maybe AWS Lambda functions later once we migrate to AWS.

So I list all the requirements here:

  1. Avoid processing a message twice by multiple workers.
  2. 50k req/s, so we need a scalable solution.
  3. Low p99 latency, hopefully < 300 ms.

Only the first requirement is a headache for now, since the Redis cache is a remote cluster. There are some connectivity issues in prod, and we really want to get rid of it to avoid the complexity and extra network hops.

Questions:

  1. How should I design the deduper when switching to Cadence?

According to the docs, Cadence guarantees workflow ID uniqueness within a domain. Maybe I can use the message ID as part of the workflow ID, for example WF-00001, to guarantee no duplicates inside a domain. There should be no issue as long as I use only one domain. However, I don't know the limitations of this approach, for example the number of workflows allowed inside a domain. Our peak processing rate is 50k messages/s.

I am not sure if this is the correct approach. More ideas are welcome.
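To make the idea concrete, here is a minimal sketch of the "message ID as workflow ID" approach. It does not use the Cadence client: `fakeDomain`, `ErrAlreadyStarted`, `WorkflowID`, and `Submit` are hypothetical names, and the in-memory map only stands in for the per-domain uniqueness check described above. The point is the control flow: derive the ID deterministically, try to start, and treat "already started" as a duplicate rather than a failure.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrAlreadyStarted is a hypothetical error standing in for whatever the
// workflow service reports when a workflow ID is already taken.
var ErrAlreadyStarted = errors.New("workflow ID already used")

// WorkflowID derives a deterministic workflow ID from a message ID, so the
// same message always maps to the same workflow.
func WorkflowID(messageID string) string {
	return "WF-" + messageID
}

// fakeDomain is an in-memory stand-in for a single domain (assumption: it
// only models the uniqueness check, nothing else).
type fakeDomain struct {
	mu  sync.Mutex
	ids map[string]struct{}
}

func newFakeDomain() *fakeDomain {
	return &fakeDomain{ids: make(map[string]struct{})}
}

// Start registers workflowID, or returns ErrAlreadyStarted if it is taken.
func (d *fakeDomain) Start(workflowID string) error {
	d.mu.Lock()
	defer d.mu.Unlock()
	if _, taken := d.ids[workflowID]; taken {
		return ErrAlreadyStarted
	}
	d.ids[workflowID] = struct{}{}
	return nil
}

// Submit reports true when the message started a new workflow and false
// when it was a duplicate. Any other error is passed to the caller.
func Submit(d *fakeDomain, messageID string) (bool, error) {
	err := d.Start(WorkflowID(messageID))
	if errors.Is(err, ErrAlreadyStarted) {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	d := newFakeDomain()
	for _, id := range []string{"00001", "00002", "00001"} {
		started, err := Submit(d, id)
		fmt.Println(WorkflowID(id), started, err)
	}
}
```

With this shape the caller never needs a separate cache lookup: the start attempt itself is the dedup check.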

  2. Is there a web page listing all the limitations of Cadence? We need it to evaluate Cadence.

Thanks



Solution

  • At a high level, Cadence is a good fit for your use case.

    1. The deduper is pretty simple. The workflow keeps a map of recent request IDs (or of all requests that belong to the given workflow ID, if their number is bounded) and checks each incoming request against it for duplicates.

    2. Most Cadence limits are deployment-specific and configurable. Let's discuss your specific use case on Slack.
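
As a rough illustration of the deduper in point 1, the sketch below keeps a bounded set of recent request IDs and evicts the oldest one when full. It is plain Go with no Cadence dependency; the `Deduper` type, `NewDeduper`, `MarkIfNew`, and the capacity value are assumptions made for this example. In a real workflow this structure would be part of the workflow's own state.

```go
package main

import (
	"container/list"
	"fmt"
)

// Deduper remembers the most recent request IDs, up to a fixed capacity.
// A capacity of 0 or less means "unbounded" in this sketch.
type Deduper struct {
	capacity int
	order    *list.List               // oldest ID at the front
	seen     map[string]*list.Element // request ID -> its position in order
}

func NewDeduper(capacity int) *Deduper {
	return &Deduper{
		capacity: capacity,
		order:    list.New(),
		seen:     make(map[string]*list.Element),
	}
}

// MarkIfNew records id and returns true if it has not been seen recently.
// It returns false, and changes nothing, when id is a duplicate.
func (d *Deduper) MarkIfNew(id string) bool {
	if _, dup := d.seen[id]; dup {
		return false
	}
	// Evict the oldest ID once the bound is reached.
	if d.capacity > 0 && d.order.Len() >= d.capacity {
		oldest := d.order.Front()
		d.order.Remove(oldest)
		delete(d.seen, oldest.Value.(string))
	}
	d.seen[id] = d.order.PushBack(id)
	return true
}

func main() {
	d := NewDeduper(2)
	for _, id := range []string{"msg-1", "msg-2", "msg-1", "msg-3", "msg-1"} {
		fmt.Printf("%s new=%v\n", id, d.MarkIfNew(id))
	}
}
```

Note the trade-off of bounding the map: once an ID is evicted, a late duplicate of it is treated as new, so the capacity has to cover the window in which duplicates can realistically arrive.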