Use dispatch_async to analyze an array concurrently in Swift


I am trying to analyze a photo concurrently using a background thread from GCD. Here is the code I have written:

dispatch_async(dispatch_get_global_queue(Int(QOS_CLASS_UTILITY.value), 0)) {
    for (var i = 0; i < 8; i++)
    {
        let color = self.photoAnalyzer.analyzeColors(imageStrips[i])
        colorList.append(color)
    }
}

For clarification on the variable names, here are their descriptions:

photoAnalyzer is an instance of a class I wrote called Analyzer that holds all of the methods to process the image.

analyzeColors is a method inside the Analyzer class that does the majority of the analysis and returns a string with the dominant color of the passed-in image.

imageStrips is an array of UIImage objects that make up the portions of the original image.

colorList is an array of strings that stores the return values of the analyzeColors method for each portion of the image.

The above code runs sequentially since the for loop only accesses one image from imageStrips at a time. What I am trying to do is analyze each image in imageStrips concurrently, but I have no idea how to go about doing that.

Any suggestions would be greatly appreciated. And if you would like to see all of the code to further help me I can post a GitHub link to it.

EDIT: This is my updated code, which analyzes the 8 strips concurrently.

dispatch_apply(8, imageQueue) { numStrips -> Void in
    let color = self.photoAnalyzer.analyzeColors(imageStrips[numStrips])
    colorList.append(color)
}

However, if I try to use more than 8 the code actually runs slower than it does sequentially.


Solution

  • There are a couple of ways of doing this, but a few observations first:

    • When you process items concurrently, you are not guaranteed the order in which the iterations will complete. Thus a simple colorList.append(color) pattern won't work if the order of the results is important. You can either prepopulate colorList and have each iteration simply do colorList[i] = color, or you could use a dictionary keyed by index. (Obviously, if order is not important, then this is not critical.)

    • Because these iterations will be running concurrently, you'll need to synchronize your updating of colorList. So do your expensive analyzeColors concurrently on background queue, but use a serial queue for the updating of colorList, to ensure you don't have multiple updates stepping over each other.

    • When doing concurrent processing, there are points of diminishing returns. For example, taking a complex task and breaking it into 2-4 concurrent loops might yield some performance benefit, but if you start increasing the number of concurrent threads too much, you'll find that the overhead of these threads starts to adversely affect performance. So benchmark this with different degrees of concurrency and don't assume that "more threads" is always better.
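The three observations above can be sketched together. This is only an illustration, written with the Swift 3+ spelling of the GCD API (DispatchQueue.concurrentPerform is the newer name for dispatch_apply). The helper name analyzeInChunks and its parameters chunkCount, placeholder and analyze are my own assumptions, not part of your code. It prepopulates the result array, synchronizes the writes on a serial queue, and caps the number of concurrent loops so you can benchmark different degrees of concurrency.

```swift
import Dispatch

// Hypothetical helper (not from the original code): analyzes `items`
// using at most `chunkCount` concurrent loops.
func analyzeInChunks<Input, Output>(
    _ items: [Input],
    chunkCount: Int,
    placeholder: Output,
    analyze: (Input) -> Output
) -> [Output] {
    // Prepopulate so each result lands at its own index,
    // regardless of the order in which iterations finish.
    var results = [Output](repeating: placeholder, count: items.count)
    guard !items.isEmpty, chunkCount > 0 else { return results }

    // Serial queue that guards every write to `results`.
    let syncQueue = DispatchQueue(label: "com.domain.app.sync")

    // Round up so that every item belongs to exactly one chunk.
    let chunkSize = (items.count + chunkCount - 1) / chunkCount

    // Blocks the caller until all chunks are done, like dispatch_apply.
    DispatchQueue.concurrentPerform(iterations: chunkCount) { chunk in
        let start = chunk * chunkSize
        let end = min(start + chunkSize, items.count)
        guard start < end else { return }
        for index in start..<end {
            let value = analyze(items[index])          // expensive work, concurrent
            syncQueue.sync { results[index] = value }  // cheap write, serialized
        }
    }
    return results
}
```

Timing this with chunkCount set to 1, 2, 4, 8 and so on is one way to find the point where extra concurrency stops helping on a given device.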

    In terms of how to achieve this, there are two basic techniques:

    1. See Performing Loop Iterations Concurrently in the Dispatch Queues chapter of the Concurrency Programming Guide, which discusses dispatch_apply, a function designed precisely for this purpose: running the iterations of a for loop concurrently.

      colorList = [String](count: 8, repeatedValue: "")  // `colorList` holds the strings returned by `analyzeColors`; adjust the element type if yours differs
      
      let queue = dispatch_get_global_queue(QOS_CLASS_UTILITY, 0)
      
      let qos_attr = dispatch_queue_attr_make_with_qos_class(DISPATCH_QUEUE_SERIAL, QOS_CLASS_UTILITY, 0)
      let syncQueue = dispatch_queue_create("com.domain.app.sync", qos_attr)
      
      dispatch_apply(8, queue) { iteration in
          let color = self.photoAnalyzer.analyzeColors(imageStrips[iteration])
          dispatch_sync(syncQueue) {
              colorList[iteration] = color
              return
          }
      }
      
      // you can use `colorList` here
      

      Note, while these iterations run concurrently, the whole dispatch_apply loop runs synchronously with respect to the queue from which you initiated it. This means that you will not want to call the above code from the main thread (we never want to block the main thread). So you will likely want to dispatch this whole thing to some background queue.

      By the way, dispatch_apply is discussed in WWDC 2011 video Blocks and Grand Central Dispatch in Practice.
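For reference, here is a rough sketch of the same idea in the Swift 3+ spelling of GCD, where dispatch_apply became DispatchQueue.concurrentPerform. Because that call blocks until every iteration has finished, the sketch wraps it in a background queue and hands the results back through a completion handler. The function name analyzeConcurrently and the parameters placeholder, completionQueue, analyze and completion are assumptions of mine for illustration.

```swift
import Dispatch

// Hypothetical wrapper: runs the blocking concurrent loop off the calling thread.
func analyzeConcurrently<Input, Output>(
    _ items: [Input],
    placeholder: Output,
    completionQueue: DispatchQueue = .main,
    analyze: @escaping (Input) -> Output,
    completion: @escaping ([Output]) -> Void
) {
    DispatchQueue.global(qos: .utility).async {
        // Prepopulated so that each iteration writes to its own slot.
        var results = [Output](repeating: placeholder, count: items.count)
        let syncQueue = DispatchQueue(label: "com.domain.app.sync")

        // Blocks this background thread (not the caller) until all iterations finish.
        DispatchQueue.concurrentPerform(iterations: items.count) { index in
            let value = analyze(items[index])
            syncQueue.sync { results[index] = value }
        }

        // Every iteration is done here, so the array can be handed back safely.
        let finished = results
        completionQueue.async { completion(finished) }
    }
}
```

In an app you would typically leave completionQueue at its default of .main and update the UI inside completion.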

    2. Another common pattern is to create a dispatch group, dispatch the tasks to a concurrent queue using that group, and use dispatch_group_notify to specify what you want to do when they are all done.

      colorList = [String](count: 8, repeatedValue: "")  // `colorList` holds the strings returned by `analyzeColors`; adjust the element type if yours differs
      
      let group = dispatch_group_create()
      let queue = dispatch_get_global_queue(QOS_CLASS_UTILITY, 0)
      
      let qos_attr = dispatch_queue_attr_make_with_qos_class(DISPATCH_QUEUE_SERIAL, QOS_CLASS_UTILITY, 0)
      let syncQueue = dispatch_queue_create("com.domain.app.sync", qos_attr)
      
      for i in 0 ..< 8 {
          dispatch_group_async(group, queue) {
              let color = self.photoAnalyzer.analyzeColors(imageStrips[i])
              dispatch_sync(syncQueue) {
                  colorList[i] = color
                  return
              }
          }
      }
      
      dispatch_group_notify(group, dispatch_get_main_queue()) {
          // use `colorList` here
      }
      
      // but not here (because the above code is running asynchronously)
      

      This approach avoids blocking the main thread altogether, though you have to be careful not to dispatch too many concurrent tasks (as the worker threads are a very limited resource).
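The group pattern looks roughly like this in the Swift 3+ spelling of GCD (DispatchGroup, async(group:) and notify(queue:)). This is a sketch under my own assumptions: the ColorResults class, the analyzeStrips function and the notifyQueue, analyze and completion parameters are hypothetical names, and the strip type is left generic so the example does not depend on UIKit.

```swift
import Dispatch

// Hypothetical reference-type holder so all dispatched closures share one array.
final class ColorResults {
    private var colors: [String]
    private let syncQueue = DispatchQueue(label: "com.domain.app.sync")

    init(count: Int) {
        colors = [String](repeating: "", count: count)
    }

    // Writes are serialized on `syncQueue`.
    func set(_ color: String, at index: Int) {
        syncQueue.sync { colors[index] = color }
    }

    // Reads go through the same queue so they never overlap a write.
    func snapshot() -> [String] {
        return syncQueue.sync { colors }
    }
}

// Hypothetical function: one dispatched task per strip, tracked by a group.
func analyzeStrips<Strip>(
    _ strips: [Strip],
    notifyQueue: DispatchQueue = .main,
    analyze: @escaping (Strip) -> String,
    completion: @escaping ([String]) -> Void
) {
    let group = DispatchGroup()
    let queue = DispatchQueue.global(qos: .utility)
    let results = ColorResults(count: strips.count)

    for (index, strip) in strips.enumerated() {
        queue.async(group: group) {
            results.set(analyze(strip), at: index)
        }
    }

    // Runs once every task submitted with the group has finished.
    group.notify(queue: notifyQueue) {
        completion(results.snapshot())
    }
}
```

The function returns immediately, so anything that needs the colors belongs inside completion rather than after the call.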

    In both of these examples, I created a dedicated serial queue for synchronizing the updates to colorList. That may be overkill. If you're not blocking the main queue (which you shouldn't do anyway), you could dispatch this synchronization code to the main queue (which is a serial queue). But it's probably cleaner to have a dedicated serial queue for this purpose. And if this were something that I was going to be interacting with from multiple threads constantly, I'd use a reader-writer pattern. But this is probably good enough for this situation.
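For completeness, the reader-writer pattern mentioned above might look something like the following in the Swift 3+ spelling of GCD. It is a minimal sketch, not something this particular problem needs: reads run concurrently with sync on a private concurrent queue, while writes use a barrier so they run alone. The class name SynchronizedArray and its members are hypothetical names of mine.

```swift
import Dispatch

// Hypothetical wrapper illustrating the reader-writer pattern.
final class SynchronizedArray<Element> {
    private var storage: [Element]
    // A private concurrent queue; barriers only work on queues you create yourself.
    private let queue = DispatchQueue(label: "com.domain.app.readerwriter",
                                      attributes: .concurrent)

    init(repeating value: Element, count: Int) {
        storage = [Element](repeating: value, count: count)
    }

    subscript(index: Int) -> Element {
        // Reads may overlap with each other.
        get { return queue.sync { storage[index] } }
        // A barrier write waits for earlier reads and runs exclusively.
        set { queue.async(flags: .barrier) { self.storage[index] = newValue } }
    }

    // Copy of the whole array, taken after all previously submitted writes.
    var snapshot: [Element] {
        return queue.sync { storage }
    }
}
```

With such a wrapper, each iteration could write its result by index without a separate synchronization queue, and the caller would read snapshot once the work is done.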