Search code examples
mathlanguage-agnosticprogramming-languagesmultiprocessing

Programming Logic - Splitting up Tasks Between Threads


Lets say you want 5 threads to process data simultaneous. Also assume, you have 89 tasks to process.

Off the bat you know 89 / 5 = 17 with a remainder of 4. The best way to split up tasks would be to have 4 (the remainder) threads process 18 (17+1) tasks each and then have 1 (# threads - remainder) thread to process 17.

This will eliminate the remainder. Just to verify:

Thread 1: Tasks  1-18  (18 tasks)
Thread 2: Tasks 19-36  (18 tasks)
Thread 3: Tasks 37-54  (18 tasks)
Thread 4: Tasks 55-72  (18 tasks)
Thread 5: Tasks 73-89  (17 tasks)

Giving you a total of 89 tasks completed.

I need a way of getting the start and ending range of each thread mathematically/programmability; where the following should print the exact thing I have listed above:

$NumTasks = 89
$NumThreads = 5
$Remainder = $NumTasks % $NumThreads 
$DefaultNumTasksAssigned = floor($NumTasks / $NumThreads)

For $i = 1 To $NumThreads
    if $i <= $Remainder Then
        $NumTasksAssigned = $DefaultNumTasksAssigned + 1
    else
        $NumTasksAssigned = $DefaultNumTasksAssigned
    endif
    $Start = ??????????
    $End = ??????????
    print Thread $i: Tasks $Start-$End ($NumTasksAssigned tasks)
Next

This should also work for any number of $NumTasks.

Note: Please stick to answering the math at hand and avoid suggesting or assuming the situation.


Solution

  • I second Will Hartung 's remark. You may just feed them one task at a time (or a few tasks at a at a time, depending if there's much overhead, i.e. if individual tasks get typically completed very fast, relative to the cost of starting/recycling threads). Your subsequent comments effectively explain that your "threads" carry a heavy creation cost, and hence your desire to feed them once with as much work as possible, rather than wasting time creating new "thread" each fed a small amount of work.

    Anyway... going to the math question...

    If you'd like to assign tasks just once, the following formula, plugged in lieu of the the ????????? in your logic, should do the trick:

    $Start = 1 
             + (($i -1) * ($DefaultNumTasksAssigned + 1) 
            - (floor($i / ($Remainder + 1)) * ($i - $Remainder))
    $End = $Start + $NumTasksAssigned -1
    

    The formula is explained as follow:
        1 is for the fact that your display / logic is one-based not zero-based
        The second term is because we generally add ($DefaultNumTasksAssigned + 1) with each iteration.
        The third term provides a correction for the last few iterations.
          Its first part, (floor($i / ($Remainder + 1)) provides 0 until $i reaches the first thread
          that doesn't receive one extra task, and 1 thereafter.
          The second part express by how much we need to correct.

    The formula for $End is easier, the only trick is the minus 1, it is because the Start and End values are inclusive (so for example between 1 an 19 there are 19 tasks not 18)

    The following slightly modified piece of logic should also work, it avoids the "fancy" formula by keeping a running tab of the $Start variable, rather than recomputing it each time..

    $NumTasks = 89
    $NumThreads = 5
    $Remainder = $NumTasks % $NumThreads 
    $DefaultNumTasksAssigned = floor($NumTasks / $NumThreads)
    $Start = 1
    For $i = 1 To $NumThreads
        if $i <= $Remainder Then    // fixed here!  need <= because $i is one-based
            $NumTasksAssigned = $DefaultNumTasksAssigned + 1
        else
            $NumTasksAssigned = $DefaultNumTasksAssigned
        endif
        $End = $Start + $NumTasksAssigned -1
        print Thread $i: Tasks $Start-$End ($NumTasksAssigned tasks)
    
        $Start = $Start + $NumTasksAssigned
    Next
    

    Here's a Python transcription of the above

    >>> def ShowWorkAllocation(NumTasks, NumThreads):
    ...   Remainder = NumTasks % NumThreads
    ...   DefaultNumTasksAssigned = math.floor(NumTasks / NumThreads)
    ...   Start = 1
    ...   for i in range(1, NumThreads + 1):
    ...     if i <= Remainder:
    ...        NumTasksAssigned = DefaultNumTasksAssigned + 1
    ...     else:
    ...        NumTasksAssigned = DefaultNumTasksAssigned
    ...     End = Start + NumTasksAssigned - 1
    ...     print("Thread ", i, ": Tasks ", Start, "-", End, "(", NumTasksAssigned,")")
    ...     Start = Start + NumTasksAssigned
    ...
    >>>
    >>> ShowWorkAllocation(89, 5)
    Thread  1 : Tasks  1 - 18 ( 18 )
    Thread  2 : Tasks  19 - 36 ( 18 )
    Thread  3 : Tasks  37 - 54 ( 18 )
    Thread  4 : Tasks  55 - 72 ( 18 )
    Thread  5 : Tasks  73 - 89 ( 17 )
    
    >>> ShowWorkAllocation(11, 5)
    Thread  1 : Tasks  1 - 3 ( 3 )
    Thread  2 : Tasks  4 - 5 ( 2 )
    Thread  3 : Tasks  6 - 7 ( 2 )
    Thread  4 : Tasks  8 - 9 ( 2 )
    Thread  5 : Tasks  10 - 11 ( 2 )
    >>>
    
    >>> ShowWorkAllocation(89, 11)
    Thread  1 : Tasks  1 - 9 ( 9 )
    Thread  2 : Tasks  10 - 17 ( 8 )
    Thread  3 : Tasks  18 - 25 ( 8 )
    Thread  4 : Tasks  26 - 33 ( 8 )
    Thread  5 : Tasks  34 - 41 ( 8 )
    Thread  6 : Tasks  42 - 49 ( 8 )
    Thread  7 : Tasks  50 - 57 ( 8 )
    Thread  8 : Tasks  58 - 65 ( 8 )
    Thread  9 : Tasks  66 - 73 ( 8 )
    Thread  10 : Tasks  74 - 81 ( 8 )
    Thread  11 : Tasks  82 - 89 ( 8 )
    >>>