Tags: ios, swift, multithreading, grand-central-dispatch, dispatch-queue

How to add numbers using multiple threads?


I'm trying to add the numbers in a range, e.g. 1 to 50,000,000, but using a for loop or reduce(_:_:) is taking too long to calculate the result.

func add(low: Int, high: Int) -> Int {
    return (low...high).reduce(0, +)
}

Is there any way to do it using multiple threads?


Solution

  • Adding a series of integers does not amount to enough work to justify multiple threads. While this admittedly took 28 seconds on a debug build on my computer, in an optimized, release build, the single-threaded approach took milliseconds.

    So, when testing performance, make sure to use an optimized “Release” build in your scheme settings (and/or manually change the optimization settings in your target’s build settings).
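    For instance, you can see the difference for yourself by timing the single-threaded sum directly (this is just a quick sketch; the function name and the `Date`-based timing are my choices, the range is scaled down to 10 million so the debug-build wait is tolerable, and the exact figures will vary by machine and, crucially, by build configuration):

    ```swift
    import Foundation

    func add(low: Int, high: Int) -> Int {
        (low...high).reduce(0, +)
    }

    // Time the single-threaded sum; expect a dramatic difference
    // between a debug (-Onone) and a release (-O) build.
    let start = Date()
    let sum = add(low: 1, high: 10_000_000)
    let elapsed = Date().timeIntervalSince(start)
    print("sum = \(sum), elapsed = \(String(format: "%.3f", elapsed))s")
    ```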

    But, let us set this aside for a second and assume that you really were doing a calculation that was complex enough to justify running it on multiple threads. In that case, the simplest approach would be to just dispatch the calculation to another thread, and perhaps dispatch the results back to the main thread:

    func add(low: Int, high: Int, completion: @escaping (Int) -> Void) {
        DispatchQueue.global().async {
            let result = (low...high).reduce(0, +)
    
            DispatchQueue.main.async {
                completion(result)
            }
        }
    }
    

    And you'd use it like so:

    add(low: 0, high: 50_000_000) { result in
        // use `result` here
        self.label.text = "\(result)"
    }
    
    // but not here, because the above runs asynchronously
    

    That will ensure that the main thread is not blocked while the calculation is being done. Again, in this example, adding 50 million integers on a release build may not even require this, but the general idea is to make sure that anything that takes more than a few milliseconds is moved off the main thread.

    Now, if the computation were significantly more complicated, one might use concurrentPerform, which is like a for loop, but where each iteration runs in parallel. You might think you could just dispatch each calculation to a concurrent queue using async, but that can easily exhaust the limited pool of GCD worker threads (called “thread explosion”, which can lead to deadlocks). So we reach for concurrentPerform to perform calculations in parallel while constraining the number of concurrent threads to the capabilities of the device in question (namely, how many cores the CPU has).
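    To make that contrast concrete, here is a sketch of the tempting async-per-iteration pattern (the iteration count is deliberately kept tiny here, so this runs fine; the point is that with millions of fine-grained blocks, each pending block can tie up one of GCD's limited worker threads):

    ```swift
    import Foundation

    let queue = DispatchQueue.global()
    let group = DispatchGroup()
    let lock = NSLock()
    var completed = 0

    // Anti-pattern: one `async` block per iteration. Harmless at this
    // scale, but with 50 million fine-grained blocks this risks
    // thread explosion; `concurrentPerform` avoids that.
    for i in 0 ..< 1_000 {
        queue.async(group: group) {
            let value = i + 1          // some per-iteration calculation
            lock.lock()
            completed += value         // synchronize the shared total
            lock.unlock()
        }
    }

    group.wait()                       // block until all iterations finish
    print(completed)                   // 1 + 2 + … + 1000
    ```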

    Let’s consider this simple attempt to calculate the sum in parallel. This is inefficient, but we’ll refine it later:

    func add(low: Int, high: Int, completion: @escaping (Int) -> Void) {
        DispatchQueue.global().async {
            let lock = NSLock()
            var sum = 0
    
            // the `concurrentPerform` below is equivalent to 
            //
            // for iteration in 0 ... (high - low) { ... }
            //
            // but the iterations run in parallel
    
            DispatchQueue.concurrentPerform(iterations: high - low + 1) { iteration in
                // do some calculation in parallel
    
                let value = iteration + low 
    
                // synchronize the update of the shared resource
    
                lock.synchronized {
                    sum += value
                }
            }
    
            // call completion handler with the result
    
            DispatchQueue.main.async {
                completion(sum)
            }
        }
    }
    

    Note, because we have multiple threads adding values, we must synchronize the updates of sum to ensure thread safety. In this case, I'm using NSLock together with the following small NSLocking extension (because introducing a GCD serial queue and/or a reader-writer pattern in these massively parallelized scenarios is even slower):

    extension NSLocking {
        func synchronized<T>(block: () throws -> T) rethrows -> T {
            lock()
            defer { unlock() }
            return try block()
        }
    }
    

    I wanted to show that simple use of concurrentPerform above, but you are going to find that it is much slower than the single-threaded implementation. That is because there is not enough work in each iteration, and we’ll do 50 million synchronizations. So we might, instead, “stride”, adding a million values per iteration:

    func add(low: Int, high: Int, completion: @escaping (Int) -> Void) {
        DispatchQueue.global().async {
            let stride = 1_000_000
            let iterations = (high - low) / stride + 1
    
            let lock = NSLock()
            var sum = 0
    
            DispatchQueue.concurrentPerform(iterations: iterations) { iteration in
                let start = iteration * stride + low
                let end = min(start + stride - 1, high)
                let subtotal = (start...end).reduce(0, +)
                lock.synchronized {
                    sum += subtotal
                }
            }
    
            DispatchQueue.main.async {
                completion(sum)
            }
        }
    }
    

    So, each iteration adds up to 1 million values in a local subtotal, and only when that calculation is done does it synchronize the update of sum. This increases the work per thread and dramatically reduces the number of synchronizations. Frankly, adding a million integers is still nowhere near enough work to justify the multithreading overhead, but it illustrates the idea.


    If you want to see a case where concurrentPerform might actually be useful, consider calculating the Mandelbrot set, where the calculation for each pixel might be computationally intense. And we again stride (e.g., each iteration calculates one row of pixels), which (a) ensures that each thread is doing enough work to justify the multithreading overhead, and (b) avoids memory contention issues (a.k.a. “cache sloshing”).
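    A minimal sketch of that row-striding idea (the dimensions, the escape-time math, and the coordinate mapping are all illustrative assumptions on my part, not the exact code from that example):

    ```swift
    import Foundation

    // Escape-time Mandelbrot calculation, one row per `concurrentPerform`
    // iteration. Each row writes to a disjoint slice of the buffer, so no
    // lock is needed, and striding by row keeps each thread's writes far
    // apart in memory, avoiding cache sloshing.
    func mandelbrot(width: Int, height: Int, maxIterations: Int) -> [Int] {
        var pixels = [Int](repeating: 0, count: width * height)
        pixels.withUnsafeMutableBufferPointer { buffer in
            DispatchQueue.concurrentPerform(iterations: height) { row in
                let ci = 2.0 * Double(row) / Double(height) - 1.0      // map row to [-1, 1)
                for col in 0 ..< width {
                    let cr = 3.0 * Double(col) / Double(width) - 2.0   // map col to [-2, 1)
                    var zr = 0.0, zi = 0.0
                    var count = 0
                    while count < maxIterations, zr * zr + zi * zi <= 4.0 {
                        (zr, zi) = (zr * zr - zi * zi + cr, 2.0 * zr * zi + ci)
                        count += 1
                    }
                    buffer[row * width + col] = count
                }
            }
        }
        return pixels
    }

    let image = mandelbrot(width: 300, height: 200, maxIterations: 100)
    ```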