I was wondering about the equivalent of Java's "volatile", and found this answer: "An equivalent to Java volatile in Python".
It (basically) says that everything is effectively volatile in Python, at least in CPython, because of the GIL. That makes sense: everything is locked by the GIL, so there are no memory barriers to worry about. But I would be happier if this were documented and guaranteed by specification, rather than being a result of the way CPython happens to be implemented at present.
Say I want one thread to post data and others to read it. I can choose between something like these:
import threading

class XFaster:
    # No synchronisation: relies on attribute access being safe on its own.
    def __init__(self):
        self._x = 0

    def set_x(self, x):
        self._x = x

    def get_x(self):
        return self._x

class XSafer:
    # Every read and write of _x goes through the same lock.
    def __init__(self):
        self._x = 0
        self._lock = threading.Lock()

    def set_x(self, x):
        with self._lock:
            self._x = x

    def get_x(self):
        with self._lock:
            return self._x
I'd rather go with XFaster, or even not use a getter and setter at all. But I also want to do things reliably and "correctly". Is there some official documentation that says this is OK? What about, say, putting a value in a dict or appending to a list?
In other words, is there a systematic, documented way of determining what I can do without a threading.Lock (without digging through dis or anything like that)? And preferably in a way that won't break with a future Python release.
On edit: I appreciate the informed discussion in comments. But what I would really want is some specification that guarantees the following:
If I execute something like this:
# in the beginning
x.a == foo
# then two threads start
# thread 1:
x.a = bar
# thread 2
do_something_with(x.a)
I want to be sure that:
- when thread 2 reads x.a, it reads either foo or bar
- once thread 1 has made the assignment, thread 2 eventually reads bar
Here are some things I want not to happen:
- the assignment x.a=bar from thread 1 isn't visible to thread 2
- x.__dict__ is in the middle of being re-hashed and so thread 2 reads garbage

TLDR: CPython guarantees that its own data structures are thread-safe against corruption. This does not mean that any custom data structures or code are race-free.
The intention of the GIL is to protect CPython's data structures against corruption. One can rely on the internal state being thread-safe.
global interpreter lock (Python documentation – Glossary)
The mechanism used by the CPython interpreter to assure that only one thread executes Python bytecode at a time. This simplifies the CPython implementation by making the object model (including critical built-in types such as dict) implicitly safe against concurrent access. [...]
This also implies correct visibility of changes across threads.
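To make the "safe against concurrent access" part concrete, here is a minimal sketch, assuming CPython, in which several threads append to one built-in list without any lock. The names `shared`, `worker`, `N_THREADS` and `ITEMS_PER_THREAD` are made up for this illustration. It only demonstrates that the list itself is not corrupted; it says nothing about the order in which items arrive.

```python
import threading

N_THREADS = 4
ITEMS_PER_THREAD = 1000

shared = []  # a built-in list, deliberately not protected by a Lock


def worker(thread_id):
    # Each append is a single operation on a built-in type; CPython keeps
    # the list's internal state consistent even with concurrent callers.
    for i in range(ITEMS_PER_THREAD):
        shared.append((thread_id, i))


threads = [threading.Thread(target=worker, args=(t,)) for t in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# No element is lost or duplicated; only the interleaving is unspecified.
print(len(shared) == N_THREADS * ITEMS_PER_THREAD)
```

This is a property of the interpreter's own containers, not of any logic built on top of them.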
However, this does not mean that any isolated statement or expression is atomic: almost any statement or expression can execute more than one bytecode instruction, and a thread switch can happen between two instructions. The GIL explicitly does not provide atomicity for these cases.
Specifically, a statement such as x.a=bar may execute arbitrarily many bytecode instructions by invoking a setter via object.__setattr__ or the descriptor protocol. It executes at least three bytecode instructions: the lookup of bar, the lookup of x, and the assignment to a.
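For anyone who does want to check this once, the following sketch uses the standard dis module to list the instructions CPython compiles for that statement. The exact instruction names and count vary between CPython versions, so the code prints them rather than assuming a fixed sequence. The variable names `code` and `opnames` are chosen for this example only.

```python
import dis

# Compile the statement in isolation. It is never executed here,
# so x and bar do not need to exist.
code = compile("x.a = bar", "<example>", "exec")
opnames = [ins.opname for ins in dis.get_instructions(code)]

# Expect two loads (bar, then x) followed by an attribute store,
# plus whatever bookkeeping instructions this CPython version adds.
print(opnames)
```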
As such, Python guarantees visibility/consistency, but provides no guarantees against race conditions. If an object is mutated concurrently, this must be synchronised for correctness.
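As an illustration of the kind of mutation that needs synchronising, here is a small sketch of a read-modify-write protected by a threading.Lock. The `Counter` class and the `run` helper are hypothetical, written only for this example. Without the lock, `self._value += 1` reads, adds and writes back in separate steps, so two threads could in principle overwrite each other's update.

```python
import threading


class Counter:
    """Hypothetical counter whose increment is a read-modify-write."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The lock makes the read, the addition and the write-back
        # behave as one indivisible step.
        with self._lock:
            self._value += 1

    def get(self):
        with self._lock:
            return self._value


def run(counter, n_threads=4, n_increments=10_000):
    def worker():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.get()


total = run(Counter())
print(total)  # 40000 with the default arguments
```

A plain assignment such as the one in XFaster is a different case: it replaces one reference with another and has no intermediate state that another thread could observe.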