The question Safely iterating over WeakKeyDictionary and WeakValueDictionary did not put me at ease as I had hoped, and it's old enough that it's worth asking again rather than commenting.
Suppose I have a class MyHashable that's hashable, and I want to build a WeakSet:
obj1 = MyHashable()
obj2 = MyHashable()
obj3 = MyHashable()
obj2.cycle_sibling = obj3
obj3.cycle_sibling = obj2
ws = WeakSet([obj1, obj2, obj3])
Then I delete some local variables, and convert to a list in preparation for a later loop:
del obj2
del obj3
list_remaining = list(ws)
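To make the timing concrete, here's a CPython-specific sketch that forces the cyclic collector to run just before the list() call. It confirms the cycle does die, though it deliberately sidesteps, rather than answers, the mid-construction race I'm worried about:

```python
import gc
import weakref

class MyHashable:
    pass

obj1 = MyHashable()
obj2 = MyHashable()
obj3 = MyHashable()
obj2.cycle_sibling = obj3
obj3.cycle_sibling = obj2
ws = weakref.WeakSet([obj1, obj2, obj3])

del obj2, obj3
# Force the cyclic collector to run right now, instead of hoping it
# happens to fire in the middle of the list() call.
gc.collect()
list_remaining = list(ws)
print(len(list_remaining))  # -> 1 (only obj1 is still alive)
```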
The question I cite seems to claim this is just fine, but even without any kind of explicit for loop, have I not already risked the cyclic garbage collector kicking in during the list() call that builds list_remaining, changing the size of the set mid-construction? I would expect this problem to be rare enough that it would be difficult to detect experimentally, but it could crash my program once in a blue moon.
I don't even feel like the various commenters on that post really came to an agreement on whether something like

for obj in list(ws):
    ...

was OK, but they did all seem to assume that list(ws) itself can run all the way through without crashing, and I'm not even convinced of that. Does the list constructor avoid using iterators somehow, and thus not care about set size changes? Can garbage collection not occur during a list constructor because list is built-in?
For the moment I've written my code to destructively pop items out of the WeakSet, thus avoiding iterators altogether. I don't mind doing it destructively, because at that point in my code I'm done with the WeakSet anyway. But I don't know if I'm being paranoid.
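Roughly, the destructive version looks like this (MyHashable and the processing step are just stand-ins):

```python
import weakref

class MyHashable:
    pass

objs = [MyHashable() for _ in range(3)]
ws = weakref.WeakSet(objs)

processed = []
while True:
    try:
        item = ws.pop()  # removes and returns an arbitrary live element
    except KeyError:     # raised once the set is empty
        break
    processed.append(item)

print(len(processed))  # -> 3 (the objs list keeps everything alive here)
```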
The docs are frustratingly lacking in information on this, but looking at the implementation, we can see that WeakSet.__iter__ has a guard against this kind of problem. During iteration over a WeakSet, weakref callbacks add references to a list of pending removals rather than removing references from the underlying set directly. If an element dies before iteration reaches it, the iterator won't yield the element, but you're not going to get a segfault or a RuntimeError: Set changed size during iteration or anything.
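Here's a small demonstration of that behavior. It relies on CPython's reference counting to kill the object at a deterministic point, so this sketch won't behave the same way on PyPy:

```python
import weakref

class MyHashable:
    pass

a, b = MyHashable(), MyHashable()
ws = weakref.WeakSet([a, b])

it = iter(ws)      # enters _IterationGuard
first = next(it)   # iteration is now in progress

# Drop the object the iterator hasn't yielded yet (iteration order
# is arbitrary, so pick whichever one isn't `first`).
if first is a:
    del b
else:
    del a

rest = list(it)    # completes cleanly; the dead element is skipped
print(rest)        # -> []
print(len(ws))     # -> 1 (pending removal committed when the guard exited)
```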
Here's the guard (not threadsafe, despite what the comment says):
class _IterationGuard:
    # This context manager registers itself in the current iterators of the
    # weak container, such as to delay all removals until the context manager
    # exits.
    # This technique should be relatively thread-safe (since sets are).

    def __init__(self, weakcontainer):
        # Don't create cycles
        self.weakcontainer = ref(weakcontainer)

    def __enter__(self):
        w = self.weakcontainer()
        if w is not None:
            w._iterating.add(self)
        return self

    def __exit__(self, e, t, b):
        w = self.weakcontainer()
        if w is not None:
            s = w._iterating
            s.remove(self)
            if not s:
                w._commit_removals()
Here's where __iter__ uses the guard:
class WeakSet:
    ...
    def __iter__(self):
        with _IterationGuard(self):
            for itemref in self.data:
                item = itemref()
                if item is not None:
                    # Caveat: the iterator will keep a strong reference to
                    # `item` until it is resumed or closed.
                    yield item
And here's where the weakref callback checks the guard:
def _remove(item, selfref=ref(self)):
    self = selfref()
    if self is not None:
        if self._iterating:
            self._pending_removals.append(item)
        else:
            self.data.discard(item)
You can also see the same guard used in WeakKeyDictionary and WeakValueDictionary.
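For example, here's the same mid-iteration experiment with a WeakValueDictionary (again CPython-specific, since it relies on reference counting to kill the value deterministically):

```python
import weakref

class Obj:
    pass

a, b = Obj(), Obj()
wvd = weakref.WeakValueDictionary({'a': a, 'b': b})

it = iter(wvd.items())   # items() iterates under the same guard
key, val = next(it)

# Drop whichever value hasn't been yielded yet.
if val is a:
    del b
else:
    del a

rest = list(it)   # no RuntimeError; the dead entry is skipped
print(rest)       # -> []
print(len(wvd))   # -> 1
```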
On old Python versions (3.0, or 2.6 and earlier), this guard is not present. If you need to support 2.6 or earlier, it looks like it should be safe to use keys, values, and items with the weak dict classes; I list no option for WeakSet because WeakSet didn't exist back then. If there's a safe, non-destructive option on 3.0, I haven't found one, but hopefully no one needs to support 3.0.