I'm using https://github.com/tidwall/Safe, a Swift concurrency library, and I think I've found a threading bug. (I'm on iOS 12.3.1, iPhone Xs; Swift 4, I think; Xcode 10.2.) The repository is now read-only, so I'm trying to debug it myself. The bug is really subtle, though, or it's caused by something I haven't even imagined, because when I do almost the same thing as the library it works fine, but the library itself deadlocks.
Here's the test code that deadlocks when it shouldn't:
    private func testCompetingDeadlock() {
        NSLog("start")
        let c = Chan<Int32>()
        let b = Chan<Int32>()
        let COUNT = 1000
        let wg = WaitGroup()
        wg.add(1)
        dispatch {
            NSLog("receiver starting")
            for _ in 0..<(2*COUNT) {
                Thread.sleep(forTimeInterval: 0.01)
                let v = <-c
                b <- v!
            }
            wg.done()
        }
        sleep(1)
        wg.add(1)
        dispatch {
            NSLog("sender 1 starting")
            for i in 0..<COUNT {
                c <- 1
                <-b
                NSLog("1 : \(i)")
            }
            NSLog("1 done")
            wg.done()
        }
        wg.add(1)
        dispatch {
            NSLog("sender 2 starting")
            for i in 0..<COUNT {
                c <- 2
                <-b
                NSLog("2 : \(i)")
            }
            NSLog("2 done")
            wg.done()
        }
        wg.wait()
        NSLog("Both done")
    }
Note that the underlying implementation of send, aka <-, is
    internal func send(_ msg: T) {
        NSLog("locking (\(Thread.current)) - \(Unmanaged.passUnretained(cond.mutex).toOpaque())")
        cond.mutex.lock()
        NSLog("locked (\(Thread.current)) - \(Unmanaged.passUnretained(cond.mutex).toOpaque())")
        threadCount += 1
        defer {
            threadCount -= 1
            NSLog("unlocking (\(Thread.current)) - \(Unmanaged.passUnretained(cond.mutex).toOpaque())")
            cond.mutex.unlock()
            NSLog("unlocked (\(Thread.current)) - \(Unmanaged.passUnretained(cond.mutex).toOpaque())")
        }
        if threadCount > 1 {
            NSLog("threadCount is \(threadCount)")
        }
        if closed {
            #if os(Linux)
                assertionFailure("Send on closed channel")
            #else
                NSException.raise(NSExceptionName(rawValue: "Exception"), format: "send on closed channel", arguments: getVaList([]))
            #endif
        }
        msgs.append(msg)
        broadcast()
        while msgs.count > cap {
            cond.wait()
        }
    }
(I added the logging and threadCount. Once deadlock occurs, threadCount is 2. I tried the same "inc after lock, dec before unlock" in the Mutex class, and there I get 3 during deadlock. I don't know how, and I haven't investigated it further, though it might be an important clue.)
If testCompetingDeadlock is run, deadlock usually occurs immediately, with the two sending threads stuck on the cond.wait() line of send, both inside the locked zone of the same mutex. I don't know how. I tried testing the Mutex itself, in the same way I perceive send to use it, as follows:
    private func testSafeMutex() {
        let mutex = Mutex()
        dispatch {
            NSLog("1 locking")
            mutex.lock()
            NSLog("1 locked")
            defer {
                NSLog("1 unlocking")
                mutex.unlock()
                NSLog("1 unlocked")
            }
            sleep(1)
        }
        dispatch {
            NSLog("2 locking")
            mutex.lock()
            NSLog("2 locked")
            defer {
                NSLog("2 unlocking")
                mutex.unlock()
                NSLog("2 unlocked")
            }
            sleep(1)
        }
    }
However, this works fine, with no deadlocks.
I'm not really sure what to do, beyond adding finer- and finer-grained logging and trying to merge the two test cases until the crucial difference is found (which would be difficult, as it's hard to keep the code functional in between versions). Can anybody help me debug this library? Is there perhaps some iOS-specific memory-model issue, or something similar?
I eventually figured it out: I forgot that the cond.wait() at the end releases the lock while the thread is waiting, which explains why I had multiple threads inside a supposedly locked section. Having figured that out, I found the true problem and came up with a fix: https://github.com/Erhannis/Safe/commit/cfa41231d01895457bfc1421d779a29a18923c5b
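To illustrate the point about cond.wait() releasing the lock, here is a minimal sketch. It uses Foundation's NSCondition as a stand-in for the library's Cond and Mutex (that substitution is my assumption, and LockProbe is a made-up name), and it counts threads the same way my threadCount did. Every thread that enters is parked in wait(), so the counter reaches the number of threads even though only one of them can hold the lock at any instant.

```swift
import Foundation

// Hypothetical probe, not part of Safe: counts threads that are
// between lock() and unlock(), including those parked in wait().
final class LockProbe {
    private let cond = NSCondition()
    private var inside = 0        // same idea as threadCount in send
    private var maxInside = 0
    private var released = false
    private var finished = 0

    private func enterAndWait() {
        cond.lock()
        inside += 1
        maxInside = max(maxInside, inside)
        cond.broadcast()          // let run() re-check `inside`
        while !released {
            cond.wait()           // unlocks while sleeping, relocks before returning
        }
        inside -= 1
        finished += 1
        cond.broadcast()
        cond.unlock()
    }

    // Starts n threads, waits until all are parked in wait(),
    // releases them, and returns the highest `inside` value seen.
    func run(threads n: Int) -> Int {
        for _ in 0..<n {
            Thread(block: { self.enterAndWait() }).start()
        }
        cond.lock()
        while inside < n { cond.wait() }
        released = true
        cond.broadcast()
        while finished < n { cond.wait() }
        let result = maxInside
        cond.unlock()
        return result
    }
}

print(LockProbe().run(threads: 2))  // prints 2
```

So a counter above 1 inside the "locked" region is expected whenever threads are waiting on the condition; it does not mean the mutex is broken.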
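The true problem, as I understand it, was basically that the condition for exiting the while loop was wrong: all sending threads would wait until the buffer had room, but the ones whose messages had already been read SHOULD have exited the loop at that point. With an unbuffered channel and two senders queued, the receiver takes the first message but msgs.count is still 1, so neither sender returns; the receiver then blocks sending on b, which nobody reads.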
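For anyone who wants to see the corrected exit condition in isolation, here is a rough sketch. This is not the code from the linked commit; it is a simplified stand-in built on NSCondition, and the names TicketChan, totalReceived and runCompeting are invented for illustration. Each sender remembers the position ("ticket") of its own message and leaves the loop once that message has been read or fits within cap, instead of waiting on msgs.count.

```swift
import Foundation

// Hypothetical simplified channel, not the library's Chan.
final class TicketChan<T> {
    private let cond = NSCondition()
    private var msgs: [T] = []
    private let cap: Int
    private var sent = 0       // total messages ever appended
    private var received = 0   // total messages ever removed

    init(cap: Int = 0) { self.cap = cap }

    func send(_ msg: T) {
        cond.lock()
        defer { cond.unlock() }
        msgs.append(msg)
        sent += 1
        let ticket = sent      // 1-based position of this sender's message
        cond.broadcast()
        // Wait only until THIS message has been read or fits in the buffer.
        while ticket - received > cap {
            cond.wait()
        }
    }

    func receive() -> T {
        cond.lock()
        defer { cond.unlock() }
        while msgs.isEmpty { cond.wait() }
        let msg = msgs.removeFirst()   // FIFO, so tickets are consumed in order
        received += 1
        cond.broadcast()
        return msg
    }

    var totalReceived: Int {
        cond.lock()
        defer { cond.unlock() }
        return received
    }
}

// Same shape as testCompetingDeadlock: one echoing receiver, two senders.
func runCompeting(count: Int) -> Int {
    let c = TicketChan<Int>()
    let b = TicketChan<Int>()
    let group = DispatchGroup()

    group.enter()
    Thread(block: {
        for _ in 0..<(2 * count) { b.send(c.receive()) }
        group.leave()
    }).start()

    for id in 1...2 {
        group.enter()
        Thread(block: {
            for _ in 0..<count {
                c.send(id)
                _ = b.receive()
            }
            group.leave()
        }).start()
    }
    group.wait()
    return b.totalReceived
}

print(runCompeting(count: 100))  // prints 200
```

Replacing the loop condition with msgs.count > cap, as in the original send, makes the same scenario hang once both senders are queued at the same time.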