Tags: ios, swift, multithreading, grand-central-dispatch

Unexpected Result with GCD Priorities


I've been experimenting with GCD priorities recently. Here's the snippet of code that I've been working with.

for _ in 1...1000 {
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)) {
        for _ in 1...10000000 {
            let _ = sin(0.64739812)
        }
        print("Finished a default")
        dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0))     {
            for _ in 1...10000 {
                let _ = sin(0.64739812)
            }
            print("Finished a high")
        }
    }
}

I expected it to print

Finished a default
Finished a high
// Repeat default and high alternating back and forth 1000 times (because there are 1000 loops)

But what actually happened was the logs printed

Finished a default
Finished a high
Finished a default x 21
Finished a high
Finished a default
Finished a high x 20
Finished a default x 977
Finished a high x 978

It makes sense in the beginning, alternating a little bit. Even 21 defaults in a row makes some sense. But then it does 977 default blocks without processing a single high block. I assume this is happening because the dispatcher is very busy dealing with everything else going on. But still, it's a high priority queue vs a default priority queue.

Does anybody have any insights as to what's going on?

Edit 1

for _ in 1...1000 {
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)) {
        print("Starting a default")
        for i in 1...10000000 {
            let _ = sin(Double(i))
        }
        print("Finished a default")
    }
}
for _ in 1...1000 {
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0)) {
        print("Starting a high")
        for i in 1...10000000 {
            let _ = sin(Double(i))
        }
        print("Finished a high")
    }
}

print("Done Dispatching Everything")

Here I would expect a couple of defaults and a couple of highs to execute before Done Dispatching Everything is printed, then all the highs to execute, followed by all the defaults.

However, here are the results:

Starting a default x6
Done Dispatching Everything // at this point, all the high and default blocks have been successfully submitted for execution.
Starting a high
Finished a default
Starting a default
Finished a default
Starting a default
Finished a default
Starting a default
Finished a default
Starting a default
Finished a default
Starting a default
Finished a default
Starting a default
Finished a default
Starting a default
Finished a high
Starting a high
Finished a default
Starting a default
Finished a default
Finished a default
Starting a default
Starting a default
Finished a default
Starting a default
Finished a default
Starting a default
Finished a default
Starting a default
Finished a high
Starting a high
Finished a default
Starting a default
Finished a default
Starting a default
// A sequence that looks like the above for around 1500 lines.
Started+Finished a high x ~500

So what's happening is that even after everything is scheduled, default is happening significantly more than high. And then after all the defaults have finished, the highs finally start to execute and finish in bulk.

Edit 2

Another block

for _ in 1...1000 {
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0)) {
        print("Starting a high")
        for i in 1...10000000 {
            let _ = sin(Double(i))
        }
        print("Finished a high")
    }
}
for _ in 1...1000 {
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)) {
        print("Starting a default")
        for i in 1...10000000 {
            let _ = sin(Double(i))
        }
        print("Finished a default")
    }
}

print("Done Dispatching Everything")

And the results blow my mind: it does the exact same thing as my second example (Edit 1). Even though the highs are all scheduled before the defaults, it still executes the default blocks first!

Edit 3

Last example, I promise

for _ in 1...1000 {
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0)) {
        print("Starting a high")
        for i in 1...10000000 {
            let _ = sin(Double(i))
        }
        print("Finished a high")
    }
}
for _ in 1...1000 {
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_BACKGROUND, 0)) {
        print("Starting a background")
        for i in 1...10000000 {
            let _ = sin(Double(i))
        }
        print("Finished a background")
    }
}

print("Done Dispatching Everything")

This executes exactly as expected. All the highs run, then all the backgrounds run, with no exceptions at all. Yet its execution differs significantly from Edit 2, even though in theory the two should behave identically.


Solution

  • The GCD documentation currently suggests that the default priority/QoS is fixed between userInitiated and utility. But it is not that simple: there are situations in which default is affected by the QoS of the caller. The intent is likely to avoid accidental priority inversions, though such inversions really only arise when a caller is waiting on other work, not when dispatching asynchronously.

    The GCD documentation used to have references to how default QoS/priority could be affected by the caller’s context, but this discussion no longer appears to be present in the current API documentation.

    But in Energy Efficiency Guide for iOS Apps, Apple explicitly tells us that default QoS “is not intended to be used by developers to classify work.” If you want to prioritize work, stick with userInteractive, userInitiated, utility, and background.
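
    For reference, the legacy DISPATCH_QUEUE_PRIORITY_* constants used in the question map onto these QoS classes: HIGH to userInitiated, DEFAULT to default, LOW to utility, and BACKGROUND to background. A minimal sketch of the modern equivalents (the queue choices here are illustrative):

```swift
import Dispatch

// Legacy global-queue priority → modern QoS (per the dispatch/queue.h mapping):
//   DISPATCH_QUEUE_PRIORITY_HIGH       → .userInitiated
//   DISPATCH_QUEUE_PRIORITY_DEFAULT    → .default
//   DISPATCH_QUEUE_PRIORITY_LOW        → .utility
//   DISPATCH_QUEUE_PRIORITY_BACKGROUND → .background

let group = DispatchGroup()

DispatchQueue.global(qos: .userInitiated).async(group: group) {
    print("higher-priority work")
}
DispatchQueue.global(qos: .utility).async(group: group) {
    print("lower-priority work")
}

group.wait()  // block the caller until both work items have run
```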


    Consider the following (where I have replaced the print statements with Instruments’ “points of interest” intervals, so we can graphically see what is going on when we profile the app):

    import os

    // Signposter for marking “points of interest” intervals in Instruments.
    let signposter = OSSignposter(subsystem: "Experiment", category: .pointsOfInterest)
    

    and

    DispatchQueue.global(qos: .userInteractive).async {
        for i in 1...200 {
            DispatchQueue.global(qos: .default).async {
                signposter.withIntervalSignpost("1 default", id: signposter.makeSignpostID(), "\(i)") {
                    for j in 1...100_000_000 {
                        // Calculate a distinct value on every iteration and check the
                        // result, so the compiler can’t optimize the calculation away.
    
                        let result = sin(Double(j))
                        if result == 0 {
                            print("well, that was unexpected")
                        }
                    }
                }
            }
        }
        for i in 1...200 {
            DispatchQueue.global(qos: .userInitiated).async {
                signposter.withIntervalSignpost("2 userInitiated", id: signposter.makeSignpostID(), "\(i)") {
                    for j in 1...100_000_000 {
                        let result = sin(Double(j))
                        if result == 0 {
                            print("well, that was unexpected")
                        }
                    }
                }
            }
        }
    }
    

    That yields the following, where userInitiated does not preempt the default work:

    dispatched from userInteractive

    However, if you replace that outer dispatch with .utility

    DispatchQueue.global(qos: .utility).async {
        … // same as above
    }
    

    … then the behavior changes, with userInitiated preempting the default work:

    dispatched from utility

    Bottom line, the behavior of default QoS can be affected by the QoS of the calling context.
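
    As a practical corollary: if a block must run at a known QoS regardless of the calling context, you can attach the QoS to the work item itself. A sketch using DispatchWorkItem with the .enforceQoS flag, which asks the system to prefer the item’s QoS over the one inferred from the context:

```swift
import Dispatch

// Attach the QoS to the work item rather than letting it be inferred from
// the queue or the dispatching context. The .enforceQoS flag asks the
// system to prefer the item’s QoS over the caller’s.
let item = DispatchWorkItem(qos: .userInitiated, flags: .enforceQoS) {
    print("runs at userInitiated even if dispatched from a lower-QoS context")
}

DispatchQueue.global().async(execute: item)
item.wait()  // wait for the work item to complete
```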


    A few additional interesting observations:

    1. The default QoS “is not intended to be used by developers to classify work.” See the Energy Efficiency Guide for iOS Apps, which says:

      Special Quality of Service Classes

      In addition to the primary QoS classes, there are two special types of QoS (described in Table 4-2). In most cases, you won’t be exposed to these classes, but there is still value in knowing they exist.

      Table 4-2: Special QoS classes

      Default: The priority level of this QoS falls between user-initiated and utility. This QoS is not intended to be used by developers to classify work. Work that has no QoS information assigned is treated as default, and the GCD global queue runs at this level.

      Unspecified: This represents the absence of QoS information and cues the system that an environmental QoS should be inferred. Threads can have an unspecified QoS if they use legacy APIs that may opt the thread out of QoS.
    2. If you change the order, dispatching to userInitiated first, and then dispatching to default, you do not see the default QoS tasks preempting the userInitiated work:

      DispatchQueue.global(qos: .userInteractive).async {
          for i in 1...200 {
              DispatchQueue.global(qos: .userInitiated).async {
                  signposter.withIntervalSignpost("1 userInitiated", id: signposter.makeSignpostID(), "\(i)") {
                      …
                  }
              }
          }
          for i in 1...200 {
              DispatchQueue.global(qos: .default).async {
                  signposter.withIntervalSignpost("2 default", id: signposter.makeSignpostID(), "\(i)") {
                      …
                  }
              }
          }
      }
      

      order changed, first userInitiated and then default

      So, it is not as simple as “default = currentQoS”. The optimization is more subtle than that.

    3. Consider the scenario where you do these two sets of dispatches in separate calls (where the latter group does not have to wait for the first batch to have been dispatched):

      DispatchQueue.global(qos: .userInteractive).async {
          for i in 1...500 {
              DispatchQueue.global(qos: .default).async {
                  signposter.withIntervalSignpost("1 default", id: signposter.makeSignpostID(), "\(i)") {
                      …
                  }
              }
          }
      }
      
      DispatchQueue.global(qos: .userInteractive).async {
          for i in 1...500 {
              DispatchQueue.global(qos: .userInitiated).async {
                  signposter.withIntervalSignpost("2 userInitiated", id: signposter.makeSignpostID(), "\(i)") {
                      …
                  }
              }
          }
      }
      

      In this case, you see the tasks interspersed between each other, with no appreciable preference for one over the other:

      separate dispatches, first default, then userInitiated

      Yes, it favors the first dispatched batch; but if we swap the order of these two separate dispatches, doing userInitiated first, we see the same behavior, again favoring whichever batch was dispatched first:

      separate dispatches, but doing userInitiated first

    4. No discussion about hundreds or thousands of separate dispatches would be complete without noting the risk of a “thread explosion”. We would often reach for concurrentPerform, rather than a for loop, to ensure that the number of worker threads does not exceed the number of processors on the device:

      DispatchQueue.global().async {
          DispatchQueue.concurrentPerform(iterations: 500) { i in
              signposter.withIntervalSignpost("1 first batch", id: signposter.makeSignpostID(), "\(i)") {
                  …
              }
          }
          DispatchQueue.concurrentPerform(iterations: 500) { i in
              signposter.withIntervalSignpost("2 second batch", id: signposter.makeSignpostID(), "\(i)") {
                  …
              }
          }
      }
      

      That yields:

      concurrentPerform

      Now, this offers no illumination on the default QoS/priority behavior (as I am explicitly not starting the second batch until the first batch finishes), but is just an observation that we would generally avoid unbridled dispatching to a queue, and instead use concurrentPerform.
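
      For completeness, here is a self-contained sketch of the concurrentPerform pattern, with the signpost intervals replaced by plain prints and the iteration counts shrunk so it finishes quickly (the counts are illustrative):

```swift
import Foundation

// concurrentPerform blocks the caller until every iteration has finished,
// and caps the number of worker threads at roughly the number of cores,
// avoiding the unbounded thread growth that thousands of separate async
// dispatches can cause.
DispatchQueue.concurrentPerform(iterations: 8) { i in
    var sum = 0.0
    for j in 1...1_000_000 { sum += sin(Double(j)) }  // CPU-bound work
    print("first batch \(i) done")
}

// Because concurrentPerform is synchronous, this second batch cannot start
// until every iteration of the first batch has returned.
DispatchQueue.concurrentPerform(iterations: 8) { i in
    print("second batch \(i) done")
}
```

      Since concurrentPerform blocks its caller, in an app you would dispatch this work off the main thread first.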