multithreading asynchronous design-patterns language-agnostic

Design pattern for checking asynchronous task dependencies before execution


The Problem

Given a number of asynchronously loaded dependencies, I want to trigger some code only after all dependencies are finished loading. As a simple example, consider the following pseudo-code:

bool firstLoaded = false, secondLoaded = false, thirdLoaded = false;

function loadResourceOne() {
    // Asynchronously, or in a new thread:
    HTTPDownload("one.txt");
    firstLoaded = true;
    if (secondLoaded && thirdLoaded) {
        allLoaded();
    }
}

function loadResourceTwo() {
    // Asynchronously, or in a new thread:
    HTTPDownload("two.txt");
    secondLoaded = true;
    if (firstLoaded && thirdLoaded) {
        allLoaded();
    }
}

function loadResourceThree() {
    // Asynchronously, or in a new thread:
    HTTPDownload("three.txt");
    thirdLoaded = true;
    if (firstLoaded && secondLoaded) {
        allLoaded();
    }
}

function allLoaded() {
    Log("Done!");
}

/* async */ loadResourceOne();
/* async */ loadResourceTwo();
/* async */ loadResourceThree();

What I'm Looking For

This is a problem that I've had to solve repeatedly in different languages and in different contexts. However, every time I end up using the tools provided by the language to hack together some simple solution, like returning each asynchronous resource as a Promise in JavaScript and then using Promise.all(), or loading each resource in its own thread in Python and then calling join() on each thread.
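
For reference, the thread-based version of that hack might look roughly like the sketch below. It is only an illustration: load_resource and load_all are hypothetical names, and the download is replaced by a placeholder string standing in for HTTPDownload.

```python
import threading


def load_resource(name, results):
    # Placeholder for HTTPDownload(name); a real loader would do I/O here.
    results[name] = f"contents of {name}"


def load_all(names, on_all_loaded):
    """Load every resource in its own thread, then run the callback once."""
    results = {}
    threads = [
        threading.Thread(target=load_resource, args=(name, results))
        for name in names
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # blocks the caller until this thread has finished
    on_all_loaded(results)
    return results
```

Note that the caller blocks in join() until everything has loaded, so this works but does not generalize to languages or contexts where blocking is not an option.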

I'm trying to find a design pattern that solves this problem in the general case. The best solution should meet three criteria:

  1. Can be applied to any language that supports asynchronous operations
  2. Minimizes repetition of code (note that in my simple example the line allLoaded(); was repeated three times, and the if statement preceding it was practically repeated, and wouldn't scale well if I need a fourth or fifth dependency)
  3. Runs the final callback as soon as possible when all resources are loaded -- this one is hopefully obvious, but solutions like "check that all are loaded every 5 seconds" aren't acceptable

I tried flipping through the index of the Gang of Four's Design Patterns, but the few pattern names that jumped out at me as possible leads turned out to be unrelated.


Solution

  • "I tried flipping through the index of the Gang of Four's Design Patterns, but the few pattern names that jumped out at me as possible leads turned out to be unrelated."

    This problem domain requires combining multiple design patterns rather than applying a single one. Let's address the key requirements:

    1. A task should be able to know when the tasks it depends on are complete so that it can start executing immediately. This needs to be achieved without periodically polling those tasks.
    2. Adding a new dependency to a task needs to be possible without adding yet another if-else style check.

    For point 1, I would suggest that you take a look at the Observer pattern. The primary advantage of this pattern in your case is that a task won't have to poll the tasks it depends on. Instead, each of those tasks notifies your task when it completes by calling its update method. The update method can check the caller off a pre-populated list of dependencies every time it is called. The moment every task on that list has called update, the task can launch its worker (a thread, for example).
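
A minimal Python sketch of that update method follows, assuming dependencies are identified by name and may report in from different threads. The class name AllLoadedObserver and the on_all_loaded callback are hypothetical, chosen only for this illustration.

```python
import threading


class AllLoadedObserver:
    """Runs a callback once every expected dependency has reported in."""

    def __init__(self, dependency_names, on_all_loaded):
        self._pending = set(dependency_names)  # dependencies not yet finished
        self._on_all_loaded = on_all_loaded
        self._lock = threading.Lock()

    def update(self, name):
        # Called by each dependency when it finishes loading.
        with self._lock:
            if name not in self._pending:
                return  # unknown or repeated notification: ignore it
            self._pending.remove(name)
            ready = not self._pending
        if ready:
            # Only the notification that empties the set reaches this point,
            # so the callback runs exactly once and without any polling.
            self._on_all_loaded()
```

Each loader would call observer.update("one.txt") (and so on) when its download completes. Adding a fourth dependency then means adding one more name to the list rather than another if statement. One caveat of this sketch: an observer created with no dependencies never fires.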

    For point 2, I would suggest that you take a look at the Composite pattern. A Task has an array of dependent Task instances and an array of Task instances it depends on. When a task finishes execution, it calls update on each of the tasks that depend on it. Conversely, a task starts executing only once every task it depends on has called its update method.

    If I had to define the above approach in pseudo-code, it would look something like this:

    Task structure:
        array of dependents   : [Task instances that depend on this Task]
        array of dependencies : [Task instances this Task depends on]

        // A Task with no dependencies is started by calling
        // executeAsynchronous directly.

        function update(Task t) :
            - remove t from dependencies
            - if (dependencies size == 0) :
                - start asynchronous activity (call executeAsynchronous)

        function executeAsynchronous() :
            - perform asynchronous work
            - on completion :
                - iterate through dependents array
                    - call update on each Task in dependents array and pass it this Task

        function addDependent(Task t) :
            - add t to array of dependents

        function addDependency(Task t) :
            - add t to array of dependencies
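
The pseudo-code above could be turned into something runnable along these lines. This is a sketch under a few assumptions that are not part of the original answer: each task's work is a blocking callable run in its own thread, the whole graph is wired up before any task starts, and add_dependency registers both directions of the relationship. The start method and the done event are additions for illustration.

```python
import threading


class Task:
    def __init__(self, name, work):
        self.name = name
        self._work = work            # callable performing the actual work
        self._dependencies = set()   # tasks that must finish before this one
        self._dependents = []        # tasks waiting on this one
        self._lock = threading.Lock()
        self.done = threading.Event()  # set once the work has completed

    def add_dependency(self, other):
        # Registers both sides: this task waits on `other`,
        # and `other` will notify this task when it completes.
        self._dependencies.add(other)
        other._dependents.append(self)

    def start(self):
        # Entry point for tasks that have no dependencies.
        if not self._dependencies:
            self._execute_async()

    def update(self, finished):
        # Called by a dependency when it completes.
        with self._lock:
            self._dependencies.discard(finished)
            ready = not self._dependencies
        if ready:
            self._execute_async()

    def _execute_async(self):
        threading.Thread(target=self._run).start()

    def _run(self):
        self._work()
        self.done.set()
        for dependent in self._dependents:
            dependent.update(self)


# Usage mirroring the question: allLoaded runs after all three loads.
if __name__ == "__main__":
    loads = [Task(n, lambda n=n: print("loaded", n)) for n in ("one", "two", "three")]
    all_loaded = Task("allLoaded", lambda: print("Done!"))
    for task in loads:
        all_loaded.add_dependency(task)
    for task in loads:
        task.start()
    all_loaded.done.wait(timeout=5)
```

Because each task only notifies its dependents once, and the check for an empty dependency set happens under a lock, the final task is launched exactly once, by whichever dependency finishes last.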
    

    All said and done, don't go looking for a design pattern to solve your problem. Instead, come up with working code and work through it to improve its design.


    Note: There is a small but significant difference between a framework and a design pattern. If the objective is to build a task-dependency framework using design patterns, you are definitely going to need more than one pattern. The above answer explains how to do this using the Gang of Four patterns. If the objective is to avoid reinventing the wheel, you can look at frameworks that already solve this problem.

    One such framework is Spring Batch, which allows you to define sequential flows and split flows that can be wired together into a job describing the end-to-end processing flow.