Writing to different Swift array indexes from different threads


I often see it said that Swift arrays, because of copy-on-write, are not thread-safe, but I have found that the following works. It updates distinct elements of an array from different threads simultaneously:

//pixels is [(UInt8, UInt8, UInt8)]

let q = DispatchQueue(label: "processImage", attributes: .concurrent)
q.sync {

  DispatchQueue.concurrentPerform(iterations: n) { i in
     ... do work ...
     pixels[i] = ... store result ...
  }
}

(simplified version of this function)

If threads never write to the same indexes, does copy-on-write still interfere with this? I'm wondering if this is safe since the array itself is not changing length or memory usage. But it does seem that copy-on-write would prevent the array from staying consistent in such a scenario.

If this is not safe, what is the best idiom for it, given that parallel computation on images (pixel arrays) or other data stores is a common requirement? Is it better for each thread to have its own array, with the arrays combined after all threads complete? That seems like additional overhead, and the memory juggling from creating and destroying all these arrays doesn't feel right.


Solution

  • Updated answer

    Is it safe to write directly to different indices of an array from different threads?

    No. If we run the following code, the thread sanitizer (see this guide) reports a race condition when writing to values[idx]. In my testing it does work in practice almost every time: I ran it in a loop many thousands of times and saw a single crash. But this is clearly not what we're meant to do.

    let NUM = 1_000_000
    
    func a() {
        var values = [Int](repeating: 0, count: NUM)
        
        DispatchQueue.concurrentPerform(iterations: NUM) { idx in
            values[idx] = idx // <- not thread safe
        }
    }
    

    However, the race does not seem to be related to any copy-on-write mechanics. Since the threads access the array via closure capture, they are all in fact accessing the same array. We can also put the array in a reference type, Box, and we still have problems with race conditions. To me, this indicates that the copy-on-write behaviour of arrays is not the root of the problem.

    class Box {
        var values: [Int]
        init(values: [Int]) {
            self.values = values
        }
        
        func update(at index: Int, value: Int) {
            values[index] = value // <- not thread safe
        }
    }
    
    func b() {
        let box = Box(values: [Int](repeating: 0, count: NUM))
        
        DispatchQueue.concurrentPerform(iterations: NUM) { idx in
            box.update(at: idx, value: idx)
        }
    }
    
    b()
    

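    If we access the underlying buffer directly via withUnsafeMutableBufferPointer, however, it should work correctly; at least, the thread sanitizer doesn't complain. As far as I can tell, this is because each write then goes straight to a distinct memory location through the buffer pointer, rather than through a mutating access to the array value as a whole.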

    func c() {
        var values = [Int](repeating: 0, count: NUM)
        
        values.withUnsafeMutableBufferPointer { buffer in
            DispatchQueue.concurrentPerform(iterations: NUM) { idx in
                buffer[idx] = idx // <- *is* thread safe
            }
        }
    }
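
Applied to the pixel array from the question, the same idea might look like the sketch below. The names Pixel, inverted, invertInParallel and chunkSize are hypothetical and not from the original code, and the inversion merely stands in for the elided "do work" step. Each iteration handles a contiguous chunk of indices rather than a single pixel, on the assumption that this cuts per-iteration scheduling overhead; the chunk size is a guess and would need measuring.

```swift
import Dispatch

// Hypothetical pixel type, matching the [(UInt8, UInt8, UInt8)] in the question.
typealias Pixel = (UInt8, UInt8, UInt8)

// Hypothetical stand-in for the per-pixel work: inverts each channel.
func inverted(_ p: Pixel) -> Pixel {
    (255 - p.0, 255 - p.1, 255 - p.2)
}

// Processes the array in place. Each iteration owns a disjoint index range,
// so no two threads ever write to the same element.
func invertInParallel(_ pixels: inout [Pixel], chunkSize: Int = 4096) {
    let count = pixels.count
    guard count > 0, chunkSize > 0 else { return }
    let chunks = (count + chunkSize - 1) / chunkSize // round up

    pixels.withUnsafeMutableBufferPointer { buffer in
        DispatchQueue.concurrentPerform(iterations: chunks) { chunk in
            let start = chunk * chunkSize
            let end = min(start + chunkSize, count)
            for i in start..<end {
                // Writes go through the buffer pointer, not the array value.
                buffer[i] = inverted(buffer[i])
            }
        }
    }
}
```

Because concurrentPerform returns only after every iteration has finished, the buffer pointer is never used outside the withUnsafeMutableBufferPointer closure, and no per-thread arrays have to be allocated and merged.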
    

    The copy-on-write behaviour of arrays in Swift should itself be thread safe. The reason the code above isn't safe is that COW isn't being triggered in the first place.

    The array value references an underlying memory buffer. Before editing the buffer, it checks whether the buffer is referenced more than once, that is, whether its reference count is greater than 1. If it is, it copies the buffer to a new memory location and uses that new buffer instead. This is thread-safe since the original buffer will be untouched.

    In the example above, the reference count for the underlying buffer remains at 1, since we don't actually have multiple arrays all referencing the same buffer; we have multiple "references" to the same array, referencing the buffer only once. Since there's only the one array value, no copy on write will happen.
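
To see when a copy does happen, here is a small sketch that compares storage addresses before and after a write. The helper names storageAddress and demonstrateCopyOnWrite are hypothetical, and the addresses are only compared, never dereferenced. With two array values sharing one buffer, the first write to one of them should move it to fresh storage and leave the other untouched.

```swift
// Hypothetical helper: address of the array's element storage, for comparison only.
func storageAddress(_ array: [Int]) -> Int {
    array.withUnsafeBufferPointer { Int(bitPattern: $0.baseAddress) }
}

func demonstrateCopyOnWrite() -> (sharedBefore: Bool, sharedAfter: Bool, original: [Int]) {
    let first = [1, 2, 3]
    var second = first // two array values, one shared buffer

    let sharedBefore = storageAddress(first) == storageAddress(second)

    // The buffer is referenced by both values, so it is copied before the write.
    second[0] = 99

    let sharedAfter = storageAddress(first) == storageAddress(second)
    return (sharedBefore, sharedAfter, first)
}
```

In functions a() and b() above there is never a second array value, so this copy never occurs and every thread writes through the same array.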