I know that there is no guarantee regarding the order of execution for threads. My question is about what happens when I run the code below:
import threading

def doSomething():
    print("Hello ")

d = threading.Thread(target=doSomething, args=())
d.start()
print("done")
The output is either
Hello done
or this
Hello
done
Maybe if I run it enough times, it might also give me this:
done
Hello
But I don't understand the first output. The order can differ, but how can both outputs end up on the same line? Does that mean one thread is interfering with the other thread's work?
This is a classic race condition. I can't personally reproduce it, and it would likely vary by interpreter implementation and the precise configuration applied to stdout. On Python interpreters without a GIL, there is basically no protection against races, and this behavior is expected to a certain extent. Python interpreters do tend to try to protect you from egregious data corruption due to threading, unlike C/C++, but even if they ensure every byte written ends up actually printed, they usually wouldn't try to make explicit guarantees against interleaving; Hdelolnoe would be a possible (if fairly unlikely, given likely implementations) output when you're making no effort whatsoever to synchronize access to stdout.
On CPython, the GIL protects you more, and writing a single string to stdout is more likely to be atomic, but you're not writing a single string. Essentially, print writes its objects one by one to the output file object as it goes; it doesn't batch them up into a single string and then call write just once. What this means is that:
print("Hello ") # Implicitly outputs default end argument of '\n' after printing provided args
is roughly equivalent to:
sys.stdout.write("Hello ")
sys.stdout.write("\n")
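One way to check this on your own interpreter is to hand print a file-like object that records every write call it receives. This is only a sketch: RecordingStream is a hypothetical helper made up for illustration, and the two-call result described here is what I'd expect from CPython, not something the language guarantees.

```python
class RecordingStream:
    """Hypothetical file-like object that records each write() call in order."""

    def __init__(self):
        self.calls = []

    def write(self, text):
        # Store the exact string passed to this individual write() call.
        self.calls.append(text)
        return len(text)

    def flush(self):
        # Nothing is buffered, so there is nothing to flush.
        pass


rec = RecordingStream()
print("Hello ", file=rec)  # print targets rec instead of sys.stdout

# On CPython, the text and the trailing newline arrive as separate calls.
write_calls = rec.calls
```

If write_calls holds two entries (the text, then the newline), then there is a gap between them in which another thread can get its own write in.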
If the underlying stack of file objects that implements sys.stdout decides to engage in real I/O in response to the first write, it will release the GIL before performing the actual write, allowing the main thread to catch up and potentially grab the GIL before the worker thread is given a chance to write the newline. The main thread then outputs the done, and then the newlines from each print come out in some unspecified (and irrelevant) order based on further potential races.
Assuming you're on CPython, you could probably fix this by changing the code to this equivalent code using single write calls:
import threading
import sys

def doSomething():
    sys.stdout.write("Hello \n")

d = threading.Thread(target=doSomething)  # If it takes no arguments, no need to pass args
d.start()
sys.stdout.write("done\n")
and you'd be back to a race condition that only swaps the order, without interleaving (the language spec wouldn't guarantee a thing, but most reasonable implementations would be atomic for this case). If you want it to work with any guarantees without relying on the quirks of the implementation, you have to synchronize:
import threading

lck = threading.Lock()

def doSomething():
    with lck:
        print("Hello ")

d = threading.Thread(target=doSomething)
d.start()
with lck:
    print("done")
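To convince yourself that the lock keeps each message and its newline together, you can run several threads against a recording stream and inspect the order of the write calls. Again, this is a rough sketch and not a definitive test: RecordingStream, locked_print and run_workers are hypothetical names invented for illustration, and it assumes print issues one write for the text and one for the newline, as CPython does.

```python
import threading


class RecordingStream:
    """Hypothetical file-like object that records each write() call in order."""

    def __init__(self):
        self.calls = []

    def write(self, text):
        self.calls.append(text)
        return len(text)

    def flush(self):
        pass


def locked_print(lock, stream, message):
    # Holding the lock for the whole print call means no other thread using
    # the same lock can write between the message and its newline.
    with lock:
        print(message, file=stream)


def run_workers(n_threads=8):
    lock = threading.Lock()
    stream = RecordingStream()
    threads = [
        threading.Thread(target=locked_print, args=(lock, stream, f"Hello {i}"))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # Wait for every worker before inspecting the record.
    return stream.calls


calls = run_workers()
# Messages may appear in any order, but each should be followed directly by "\n".
pairs_intact = all(calls[i + 1] == "\n" for i in range(0, len(calls), 2))
```

The order of the messages is still up to the scheduler; the lock only ensures that each print completes as a unit.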