The question Safely iterating over WeakKeyDictionary and WeakValueDictionary did not put me at ease as I had hoped, and it's old enough that it's worth asking again rather than commenting.
Suppose I have a class MyHashable that's hashable, and I want to build a WeakSet:
obj1 = MyHashable()
obj2 = MyHashable()
obj3 = MyHashable()
obj2.cycle_sibling = obj3
obj3.cycle_sibling = obj2
ws = WeakSet([obj1, obj2, obj3])
Then I delete some local variables, and convert to a list in preparation for a later loop:
del obj2
del obj3
list_remaining = list(ws)
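To make the timing concrete, here's a CPython-specific sketch that forces the cyclic collector to run just before the list() call. It confirms the cycle does die, though it deliberately sidesteps, rather than answers, the mid-construction race I'm worried about:

```python
import gc
import weakref

class MyHashable:
    pass

obj1 = MyHashable()
obj2 = MyHashable()
obj3 = MyHashable()
obj2.cycle_sibling = obj3
obj3.cycle_sibling = obj2
ws = weakref.WeakSet([obj1, obj2, obj3])

del obj2, obj3
# Force the cyclic collector to run right now, instead of hoping it
# happens to fire in the middle of the list() call.
gc.collect()
list_remaining = list(ws)
print(len(list_remaining))  # -> 1 (only obj1 is still alive)
```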
The question I cite seems to claim this is just fine, but even without any kind of explicit for loop, have I not already risked the cyclic garbage collector kicking in during the list() call that builds list_remaining, changing the size of the set mid-construction? I would expect this problem to be rare enough that it would be difficult to detect experimentally, but it could crash my program once in a blue moon.
I don't even feel like the various commenters on that post really came to an agreement on whether something like

for obj in list(ws):
    ...

was OK, but they did all seem to assume that list(ws) itself can run all the way through without crashing, and I'm not even convinced of that. Does the list constructor avoid using iterators somehow, and thus not care about set size changes? Can garbage collection not occur during a list constructor because list is built-in?
For the moment I've written my code to destructively pop items out of the WeakSet, thus avoiding iterators altogether. I don't mind doing it destructively, because at that point in my code I'm done with the WeakSet anyway. But I don't know if I'm being paranoid.
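Roughly, the destructive version looks like this (MyHashable and the processing step are just stand-ins):

```python
import weakref

class MyHashable:
    pass

objs = [MyHashable() for _ in range(3)]
ws = weakref.WeakSet(objs)

processed = []
while True:
    try:
        item = ws.pop()  # removes and returns an arbitrary live element
    except KeyError:     # raised once the set is empty
        break
    processed.append(item)

print(len(processed))  # -> 3 (the objs list keeps everything alive here)
```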
The docs are frustratingly lacking in information on this, but looking at the implementation, we can see that WeakSet.__iter__ has a guard against this kind of problem. During iteration over a WeakSet, weakref callbacks add references to a list of pending removals rather than removing references from the underlying set directly. If an element dies before iteration reaches it, the iterator won't yield the element, but you're not going to get a segfault or a RuntimeError: Set changed size during iteration or anything.
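Here's a small demonstration of that behavior. It relies on CPython's reference counting to kill the object at a deterministic point, so this sketch won't behave the same way on PyPy:

```python
import weakref

class MyHashable:
    pass

a, b = MyHashable(), MyHashable()
ws = weakref.WeakSet([a, b])

it = iter(ws)      # enters _IterationGuard
first = next(it)   # iteration is now in progress

# Drop the object the iterator hasn't yielded yet (iteration order
# is arbitrary, so pick whichever one isn't `first`).
if first is a:
    del b
else:
    del a

rest = list(it)    # completes cleanly; the dead element is skipped
print(rest)        # -> []
print(len(ws))     # -> 1 (pending removal committed when the guard exited)
```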
Here's the guard (not threadsafe, despite what the comment says):
class _IterationGuard:
    # This context manager registers itself in the current iterators of the
    # weak container, such as to delay all removals until the context manager
    # exits.
    # This technique should be relatively thread-safe (since sets are).

    def __init__(self, weakcontainer):
        # Don't create cycles
        self.weakcontainer = ref(weakcontainer)

    def __enter__(self):
        w = self.weakcontainer()
        if w is not None:
            w._iterating.add(self)
        return self

    def __exit__(self, e, t, b):
        w = self.weakcontainer()
        if w is not None:
            s = w._iterating
            s.remove(self)
            if not s:
                w._commit_removals()
Here's where __iter__ uses the guard:
class WeakSet:
    ...
    def __iter__(self):
        with _IterationGuard(self):
            for itemref in self.data:
                item = itemref()
                if item is not None:
                    # Caveat: the iterator will keep a strong reference to
                    # `item` until it is resumed or closed.
                    yield item
And here's where the weakref callback checks the guard:
def _remove(item, selfref=ref(self)):
    self = selfref()
    if self is not None:
        if self._iterating:
            self._pending_removals.append(item)
        else:
            self.data.discard(item)
You can also see the same guard used in WeakKeyDictionary and WeakValueDictionary.
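For example, here's the same mid-iteration experiment with a WeakValueDictionary (again CPython-specific, since it relies on reference counting to kill the value deterministically):

```python
import weakref

class Obj:
    pass

a, b = Obj(), Obj()
wvd = weakref.WeakValueDictionary({'a': a, 'b': b})

it = iter(wvd.items())   # items() iterates under the same guard
key, val = next(it)

# Drop whichever value hasn't been yielded yet.
if val is a:
    del b
else:
    del a

rest = list(it)   # no RuntimeError; the dead entry is skipped
print(rest)       # -> []
print(len(wvd))   # -> 1
```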
On old Python versions (3.0, or 2.6 and earlier), this guard is not present. If you need to support 2.6 or earlier, it looks like it should be safe to use keys, values, and items with the weak dict classes; I list no option for WeakSet because WeakSet didn't exist back then. If there's a safe, non-destructive option on 3.0, I haven't found one, but hopefully no one needs to support 3.0.