Tags: c, boehm-gc

How does one implement weak references with Boehm GC?


I have a personal project that uses the Boehm GC. I need to implement a kind of event type that holds references to other events. However, the events pointed to must remain collectable, so I need weak references for this.

Let's say we have events A, B and C. I configure these events to signal event X whenever any of them is signaled, which means A, B and C must each hold a reference to X. What I want is that once X is otherwise unreachable, A, B and C no longer need to signal it, so their references should not keep X alive. That is why I thought of weak references.

Is there any other way to do this? I don't want to change the GC, but I could if necessary, as long as the allocation interface remains clean.

The project is written in C. If need be, I will provide more info. Notably, if there is any way to implement such events directly with these semantics, there is no need for actual weak references (events MAY form a reference cycle, though, while they are not signaled).


Solution

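  • The Boehm GC does not have a dedicated weak reference type per se; the closest built-in facility is the disappearing-link interface declared in gc.h (for example GC_general_register_disappearing_link()), which clears a registered location once a given object becomes unreachable. Beyond that, the collector does not scan memory allocated by the system malloc for references to managed objects, so pointers stored in such memory do not prevent the pointed-to object from being collected. Of course, that approach means that the objects containing the pointers will not be managed by the collector.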

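    Alternatively, it should be possible to abuse GC_MALLOC_ATOMIC() or GC_malloc_explicitly_typed() to obtain a managed object that can contain pointers to other managed objects without preventing those other objects from being collected. With GC_MALLOC_ATOMIC() the whole object is treated as pointer-free, whereas GC_malloc_explicitly_typed() lets you describe which words of the object hold pointers. Either way, this basically involves lying to the GC about whether some members are pointers, so as to prevent them from being scanned.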

    Either way, you also require some mechanism for receiving notice when weakly-referenced objects are collected, so as to avoid attempting to access them afterward. GC has an interface for registering finalizer callbacks to be invoked before an object is collected, and that looks like your best available option for the purpose.

    Overall, I think what you're asking for is doable, but with a lot of DIY involved. At a high level,

    • use GC_MALLOC_ATOMIC() to allocate a wrapper object around a pointer to the weakly referenced object. Allocating it this way allows the wrapper to itself be managed by GC, without the pointer within being scanned during GC's reachability analyses.
    • use GC_register_finalizer() to register, on the pointed-to object, a finalizer function that sets the wrapper's pointer to NULL when GC decides that the pointed-to object is inaccessible. The wrapper can be passed to the finalizer as its client data.
    • users of the wrapper are obligated to check whether the pointer within is NULL before attempting to dereference it.