Situation:
I want to use AWS SWF to coordinate long-running manual activities. When an activity is scheduled in SWF, I copy it to a DB so the UI can show which tasks are pending. These tasks can take weeks to complete, so they have very long timeouts in SWF.
Problem:
If my application fails to populate the DB (it hangs or dies without reporting an error), nobody sees the task, and a retry can only happen weeks later, when the activity times out. That is obviously unacceptable.
Question:
So I would like to be able to "start" the task with a short timeout (say 30 seconds) and, once the application is sure the activity has started, extend the timeout to weeks. Is it possible to do this elegantly with SWF?
(I've read through the docs and several examples and still don't understand what the intended way of running manual tasks is.)
Unfortunately, the SWF service doesn't support a "start activity task" API call. The workaround I used was an activity with a short timeout that inserts the record into the DB. Then, when the manual task completes, the workflow is signaled about it. A separate timer is needed to handle the manual task timeout. All of this logic can be encapsulated in a separate class for reuse.
An added benefit of using signals is that manual tasks usually have more than one state. For example, the workflow can be signaled when a task is claimed and again when it is later released. Each state can have a different timeout.
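To make the per-state timeout idea concrete, here is a minimal plain-Java sketch of the bookkeeping only. It deliberately does not use the Flow Framework. The state names, the signal names ("claimed", "released", "completed") and the ManualTaskTracker class are all hypothetical and chosen just for illustration. In a real workflow, the timeout returned for each new state would presumably be used to cancel the previous timer and create a new one.

```java
import java.time.Duration;
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical manual-task states; these names are illustrative, not part of SWF. */
enum ManualTaskState { UNCLAIMED, CLAIMED, COMPLETED }

/** Tracks the state of one manual task and the timeout that applies to that state. */
final class ManualTaskTracker {
    private final Map<ManualTaskState, Duration> timeouts = new EnumMap<>(ManualTaskState.class);
    private ManualTaskState state = ManualTaskState.UNCLAIMED;

    ManualTaskTracker(Duration unclaimedTimeout, Duration claimedTimeout) {
        timeouts.put(ManualTaskState.UNCLAIMED, unclaimedTimeout);
        timeouts.put(ManualTaskState.CLAIMED, claimedTimeout);
        // COMPLETED has no entry on purpose: a finished task needs no timer.
    }

    ManualTaskState state() {
        return state;
    }

    /**
     * Applies a signal (hypothetical names) and returns the timeout for the
     * new state, or an empty Optional when no timer is needed any more.
     */
    Optional<Duration> onSignal(String signalName) {
        switch (signalName) {
            case "claimed":
                requireState(ManualTaskState.UNCLAIMED, signalName);
                state = ManualTaskState.CLAIMED;
                break;
            case "released":
                requireState(ManualTaskState.CLAIMED, signalName);
                state = ManualTaskState.UNCLAIMED;
                break;
            case "completed":
                requireState(ManualTaskState.CLAIMED, signalName);
                state = ManualTaskState.COMPLETED;
                break;
            default:
                throw new IllegalArgumentException("Unknown signal: " + signalName);
        }
        return Optional.ofNullable(timeouts.get(state));
    }

    private void requireState(ManualTaskState expected, String signalName) {
        if (state != expected) {
            throw new IllegalStateException(
                    "Signal '" + signalName + "' is not valid in state " + state);
        }
    }
}
```

For example, a tracker created with new ManualTaskTracker(Duration.ofDays(7), Duration.ofDays(2)) gives an unclaimed task a week and a claimed task two days. Whether a released task should get a fresh full timeout or only the remaining time is a design choice this sketch leaves open.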
[Edit: Added strawman ManualActivityClient example]
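import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeoutException;

import com.amazonaws.services.simpleworkflow.flow.WorkflowClock;
import com.amazonaws.services.simpleworkflow.flow.core.Promise;
import com.amazonaws.services.simpleworkflow.flow.core.Settable;
import com.amazonaws.services.simpleworkflow.flow.core.Task;
import com.amazonaws.services.simpleworkflow.flow.core.TryFinally;

public class ManualActivityClient {

    private final Map<String, Settable<Void>> outstandingManualActivities = new HashMap<>();

    // Client for the short-timeout activity that inserts the record into the DB.
    // Initialization of these two fields is omitted in this strawman.
    private StartManualActivityClient startActivityClient;

    private WorkflowClock clock;

    // timeout is in seconds, as expected by WorkflowClock.createTimer.
    public Promise<Void> invoke(final String id, final String activityArgs, final long timeout) {
        // If the start activity fails or times out, its exception is delivered to the calling scope.
        Promise<Void> started = startActivityClient.start(id, activityArgs);
        final Settable<Void> completionPromise = new Settable<>();
        outstandingManualActivities.put(id, completionPromise);
        // TryFinally is used to define cancellation scope for automatic timer cancellation.
        new TryFinally() {
            @Override
            protected void doTry() throws Throwable {
                // Wrap timer invocation in Task(true) to give it daemon flag. Daemon tasks are automatically
                // cancelled when all other tasks in the same scope (defined by doTry) are done.
                new Task(true) {
                    @Override
                    protected void doExecute() throws Throwable {
                        Promise<Void> manualActivityTimeout = clock.createTimer(timeout);
                        new Task(manualActivityTimeout) {
                            @Override
                            protected void doExecute() throws Throwable {
                                throw new TimeoutException("Manual activity " + id + " timed out");
                            }
                        };
                    }
                };
                // This task is used to "wait" for manual task completion. Without it the timer would be
                // immediately cancelled.
                new Task(completionPromise) {
                    @Override
                    protected void doExecute() throws Throwable {
                        // Intentionally empty
                    }
                };
            }

            @Override
            protected void doFinally() throws Throwable {
                // Forget the activity on completion, timeout or cancellation so the map does not grow.
                outstandingManualActivities.remove(id);
            }
        };
        return completionPromise;
    }

    public void signalManualActivityCompletion(String id) {
        // Ignore signals for unknown, timed-out or already completed activities.
        Settable<Void> completionPromise = outstandingManualActivities.remove(id);
        if (completionPromise != null) {
            // Set completionPromise to ready state
            completionPromise.set(null);
        }
    }
}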
This class can be used as follows:
@Workflow(...)
public class ManualActivityWorkflow {

    private ManualActivityClient manualActivityClient;

    @Execute(...)
    public void execute() {
        // ...
        Promise<Void> activity1 = manualActivityClient.invoke("activity1", "someArgs1", 300);
        Promise<Void> activity2 = manualActivityClient.invoke("activity2", "someArgs2", 300);
        // ...
    }

    @Signal(...)
    public void signalManualActivityCompletion(String id) {
        manualActivityClient.signalManualActivityCompletion(id);
    }
}