Python v3.10
Unexpected behavior:
import threading
lock = threading.RLock()
def th(name):
    print( f"{name} tread started")
    lock.acquire()
    print( f"{name} tread end")
th1 = threading.Thread(target=th, args=[1])
th2 = threading.Thread(target=th, args=[2])
th3 = threading.Thread(target=th, args=[3])
th1.start()
th2.start()
th3.start()
Output ->
1 tread started
1 tread end
2 tread started
2 tread end
3 tread started
3 tread end
We can clearly see that all 3 threads acquire the RLock (sometimes 2, sometimes 3).
Expected behavior:
import threading
import time
lock = threading.RLock()
def th(name):
    print( f"{name} tread started")
    lock.acquire()
    time.sleep(0.1) # simulating some work
    print( f"{name} tread end")
th1 = threading.Thread(target=th, args=[1])
th2 = threading.Thread(target=th, args=[2])
th3 = threading.Thread(target=th, args=[3])
th1.start()
th2.start()
th3.start()
Output ->
1 tread started
2 tread started
3 tread started
1 tread end
When there's some work, the RLock does its thing (it is acquired by thread 1 and blocks thread 2 and thread 3 until thread 1 releases it). I tried this with loops too, but it seems that when there's little or no work in the threads, the RLock is acquired by multiple threads.
I think the answer by @pmod, currently marked as the "solution", is inaccurate in several ways, so I want to attempt to correct it. The experiment conducted there also does not prove the theory; it only works coincidentally.
The claim made there is:
"lock is released when thread is destroyed"
This is not the case, as the following experiment demonstrates.
from threading import Thread, RLock, get_ident
import time
import datetime
lock = RLock()
def print2(msg):
    print('[%s][%s] %s' % (datetime.datetime.now(), get_ident(), msg))
def work():
    print2('start')
    time.sleep(1)  # keeps both threads alive at the same time
    lock.acquire()  # deliberately never released
    print2('done')
if __name__ == '__main__':
    t1 = Thread(target=work)
    t1.start()
    t2 = Thread(target=work)
    t2.start()
    t1.join()
    t2.join()
The output will be something like below and the program will never finish.
[2024-08-29 21:41:20.799172][127474860033600] start
[2024-08-29 21:41:20.802444][127474849547840] start
[2024-08-29 21:41:21.802138][127474860033600] done
This means that one of the threads has finished, yet the second thread did not take ownership of the RLock, so the lock is not released when its owning thread exits.
Now to what is really happening in the original snippet. A closer look at how RLock is implemented answers that question:
https://github.com/python/cpython/blob/3.12/Lib/threading.py#L137
In the source code we see that, in order to track the owner of the lock (record it and later check it), RLock uses the threading.get_ident() function. Since RLock implements "reentrant" lock semantics, any thread for which get_ident happens to return the same value that was recorded when the lock was first acquired is allowed to pass a subsequent acquire invocation. The call is treated as a recursive acquire and the recursion counter is incremented.
Let's check the documentation to see what threading.get_ident() actually is:
Return the ‘thread identifier’ of the current thread. This is a nonzero integer. Its value has no direct meaning; it is intended as a magic cookie to be used e.g. to index a dictionary of thread-specific data. Thread identifiers may be recycled when a thread exits and another thread is created.
The last sentence is crucial here. Basically, get_ident returns an identifier of the Python logical thread that distinguishes it from the other threads that are alive. However, as soon as a thread completes its work, its identifier can be recycled and reused. For RLock this means that if the Thread that acquired the lock finishes without releasing it, chances are that its thread identifier will be reused by another thread, and subsequent acquire calls from such threads will be granted.
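To make the owner check concrete, here is a heavily simplified sketch of a reentrant lock, loosely modelled on the pure-Python implementation linked above. SketchRLock, ident_func and the identifiers 1001 and 2002 are my own illustrative names and values, not part of threading. The injectable identifier function is there only so that identifier reuse can be simulated deterministically in a single thread.

```python
import threading


class SketchRLock:
    """Simplified reentrant lock; hypothetical, for illustration only."""

    def __init__(self, ident_func=threading.get_ident):
        self._block = threading.Lock()  # the underlying non-reentrant lock
        self._owner = None              # identifier recorded at first acquire
        self._count = 0                 # recursion counter
        self._ident = ident_func

    def acquire(self, blocking=True):
        me = self._ident()
        if self._owner == me:
            # Same identifier as the recorded owner: treated as recursion.
            self._count += 1
            return True
        got = self._block.acquire(blocking)
        if got:
            self._owner = me
            self._count = 1
        return got

    def release(self):
        if self._owner != self._ident():
            raise RuntimeError("cannot release un-acquired lock")
        self._count -= 1
        if self._count == 0:
            self._owner = None
            self._block.release()


# Simulated identifiers (assumption: stand-ins for get_ident() results).
fake = {"ident": 1001}
lock = SketchRLock(ident_func=lambda: fake["ident"])

first = lock.acquire()                 # "thread A" acquires, then exits
second = lock.acquire(blocking=False)  # "thread B" got the recycled id 1001
fake["ident"] = 2002
third = lock.acquire(blocking=False)   # "thread C" has a different id

print(first, second, third, lock._count)
```

With a recycled identifier the second acquire is indistinguishable from a recursive one, while a caller with a different identifier is refused.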
Here is a test with some useful output:
from threading import Thread, RLock, get_ident, get_native_id
import datetime
lock = RLock()
def print2(msg):
    # Prefix each message with timestamp, get_ident() and get_native_id()
    print('[%s][%s][%s] %s' % (
        datetime.datetime.now(), get_ident(), get_native_id(), msg))
def work():
    print2('start')
    lock.acquire()  # deliberately never released
    print2('done, %s' % lock)
if __name__ == '__main__':
    threads = []
    for idx in range(5):
        t = Thread(target=work)
        t.start()
        threads.append(t)
    print2('waiting for threads to join...')
    for t in threads:
        t.join()
[2024-08-29 22:01:13.019496][126592235865664][10515] start
[2024-08-29 22:01:13.019580][126592235865664][10515] done, <locked _thread.RLock object owner=126592235865664 count=1 at 0x73228dd489c0>
[2024-08-29 22:01:13.020065][126592225379904][10516] start
[2024-08-29 22:01:13.020381][126592235865664][10517] start
[2024-08-29 22:01:13.020394][126592235865664][10517] done, <locked _thread.RLock object owner=126592235865664 count=2 at 0x73228dd489c0>
[2024-08-29 22:01:13.020672][126592235865664][10518] start
[2024-08-29 22:01:13.020684][126592235865664][10518] done, <locked _thread.RLock object owner=126592235865664 count=3 at 0x73228dd489c0>
[2024-08-29 22:01:13.020857][126592235865664][10519] start
[2024-08-29 22:01:13.020869][126592235865664][10519] done, <locked _thread.RLock object owner=126592235865664 count=4 at 0x73228dd489c0>
[2024-08-29 22:01:13.020904][126592257781760][10514] waiting for threads to join...
Notice how every time the RLock is granted we see the same get_ident result, and we also see the RLock recursion count growing.
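As a contrast, here is a small sketch in which all threads are guaranteed to be alive at the same time, so their identifiers cannot be recycled among them. The Barrier, the 0.2 second timeout and the results/idents lists are my own additions for illustration. I would expect exactly one acquire to succeed here, because the owner never releases and the other threads carry different identifiers.

```python
import threading

lock = threading.RLock()
barrier = threading.Barrier(3)
results = []  # outcome of each acquire attempt
idents = []   # identifier seen by each thread


def work():
    barrier.wait()  # all three threads exist at this point
    idents.append(threading.get_ident())
    # Never released on purpose; the timeout only keeps the demo from hanging.
    results.append(lock.acquire(timeout=0.2))


threads = [threading.Thread(target=work) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results), len(set(idents)))
```

The difference from the test above is only in thread lifetime: there, each thread could finish before the next one started, which is what made identifier reuse possible.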
This is also quite platform specific. For instance, on Windows the same test does not reproduce the behavior, because get_ident does not immediately reuse the identifier of a finished thread. For example, here is the output from a Windows environment (notice also that get_ident and get_native_id return the same values):
[2024-08-29 22:04:32.838318][10164][10164] start
[2024-08-29 22:04:32.839277][10164][10164] done, <locked _thread.RLock object owner=10164 count=1 at 0x000002B332E63DC0>
[2024-08-29 22:04:32.839277][12608][12608] start
[2024-08-29 22:04:32.839277][20084][20084] start
[2024-08-29 22:04:32.839277][20688][20688] start
[2024-08-29 22:04:32.839277][18112][18112] start
[2024-08-29 22:04:32.839277][15948][15948] waiting for threads to join...
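On either platform the underlying problem in the original snippet is that the lock is acquired and never released. A minimal sketch of the usual pattern, assuming the intent was simply mutual exclusion, is to hold the RLock in a with block so it is released when the block exits (the finished list and the final non-blocking check are my own additions for illustration):

```python
import threading

lock = threading.RLock()
finished = []


def work(name):
    with lock:  # released on exit, even if the body raises
        finished.append(name)


threads = [threading.Thread(target=work, args=(n,)) for n in (1, 2, 3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread can take the lock without blocking, so nobody still holds it.
free_afterwards = lock.acquire(blocking=False)
if free_afterwards:
    lock.release()

print(sorted(finished), free_afterwards)
```

With the lock released by every thread, the result no longer depends on whether thread identifiers happen to be recycled.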