Kitchen Service example in WWDC Swift Concurrency Session is confusing

WWDC session Beyond the basics of structured concurrency

func run() async throws {
    try await withThrowingTaskGroup(of: Void.self) { group in
        for cook in staff.keys {
            group.addTask { try await cook.handleShift() }
        }

        group.addTask {
            // keep the restaurant going until closing time
            try await Task.sleep(for: shiftDuration)
        }

        try await group.next()
        // cancel all ongoing shifts
        group.cancelAll()
    }
}

The instructor explained once the cooks are working we start a timer. When the timer finishes we cancel all ongoing shifts. But await group.next() only awaited on the next task, which may not be the timer. if one cook gets off work, the group also cancel all tasks. Did I miss something or this example is wrong?

Solution

Consider the example you included in your question:

func run() async throws {
    try await withThrowingTaskGroup(of: Void.self) { group in
        for cook in staff.keys {
            group.addTask { try await cook.handleShift() }
        }

        group.addTask {
            // keep the restaurant going until closing time
            try await Task.sleep(for: shiftDuration)
        }

        try await group.next()
        // cancel all ongoing shifts
        group.cancelAll()
    }
}

Your interpretation is correct, that run will finish as soon as any one of the group’s tasks finish first, whether it is the “end of shift” timeout task, or any one (!) of the handleShift calls. It only makes sense if, you assume that none of the handleShift calls will finish before shiftDuration elapses. But it is imprudent to make any such assumptions. (Besides, no real business shuts their offices for the day as soon as one employee runs out of work. Lol.)

This is just an extraordinarily bad example, IMHO, and I wouldn’t lose sleep over it. Your assessment is correct.

If I wanted run to (a) let all the cooks finish their work; and (b) cancel any unfinished work at the end of the shift, I would personally would not include the “end of shift” as part of the task group. Often, this “timeout” pattern would be employed with unstructured concurrency, with one task for the main task (i.e., all the cooks’ work), and another for the “timeout” (i.e., the end of the shift). And because we’re using unstructured concurrency, we would manually propagate cancelation with withTaskCancellationHandler:

func run() async throws {
    // Unstructured concurrency for all the cooks

    let workTask = Task {
        try await withThrowingTaskGroup(of: Void.self) { group in
            for cook in staff.keys {
                group.addTask { try await cook.handleShift() }
            }

            try await group.waitForAll()
            print("all work done")
        }
    }

    // Unstructured concurrency for the end of the shift

    let cancelTask = Task {
        try await Task.sleep(for: shiftDuration) // keep the restaurant going until closing time
        print("ending shift")
        workTask.cancel()                        // at the end of the shift, tell the cooks to stop working
    }

    // And because we're using unstructured concurrency, we need to manually handle
    // cancelation of `run`, itself.

    try await withTaskCancellationHandler {
        _ = try await workTask.value             // perform all the work of `handleShift` for all the cooks
        cancelTask.cancel()                      // if all the work finished, then `cancelTask` is no longer needed and should be canceled, itself; obviously, if the timeout `Task` ended up getting invoked (i.e., work was still in progress at the end of the shift), then we will not reach this line because `workTask` will throw `CancellationError`
    } onCancel: {
        cancelTask.cancel()                      // if `run` was canceled, though (e.g., there was a fire and the restaurant needed to be closed prematurely; lol), then just clean up
        workTask.cancel()
    }
}

Or you could remain within structured concurrency, using their pattern, but wrap all the cooks’ work in a group-within-a-group:

func run() async throws {
    try await withThrowingTaskGroup(of: Void.self) { [staff, shiftDuration] group in
        group.addTask {
            try await withThrowingDiscardingTaskGroup { group in // or `withThrowingTaskGroup(of: Void.self)` in older versions
                for cook in staff.keys {
                    group.addTask { try await cook.handleShift() }
                }
            }
            print("all work done")
        }

        group.addTask {
            try await Task.sleep(for: shiftDuration) // keep the restaurant going until closing time
            print("ending shift")
        }

        try await group.next()
        group.cancelAll()
    }
}

Both of these are a little unwieldy (especially for a WWDC audience), so I can understand why Apple over-simplified their example (but at the cost of correctness for the business problem they were allegedly trying to solve). I suspect they were just trying to illustrate that you can cancel a task group and did not want to get into the weeds of something like this. But these two revised snippets are examples of the common “timeout” pattern. And this would probably be a more correct implementation of the sort of business problem they allegedly were trying to illustrate.