Tags: java, java-8, weak-references, finalizer, phantom-reference

Optimal way to use phantom references in Java 8, or use weak references instead?


I am implementing a feature that reports an error when instances of my Java class are discarded before being "used" (for simplicity, we can define being "used" as having a particular method called).

My first idea is to use phantom references, which are often used as an improvement on finalize() methods. I would have a phantom reference class whose referent is my main object (the one whose premature discarding I want to detect), something like this:

import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.util.HashSet;
import java.util.Set;

class MainObject {
    static class MyPhantom extends PhantomReference<MainObject> {
        // keeps the phantom references themselves strongly reachable
        static Set<MyPhantom> phantomSet = new HashSet<>();
        MyPhantom(MainObject obj, ReferenceQueue<MainObject> queue) {
            super(obj, queue);
            phantomSet.add(this);
        }
        @Override
        public void clear() { // Reference.clear() is public, so the override must be public too
            super.clear();
            phantomSet.remove(this);
        }
    }
    MyPhantom myPhantom = new MyPhantom(this, referenceQueue);
    static ReferenceQueue<MainObject> referenceQueue = new ReferenceQueue<>();
    void markUsed() {
        myPhantom.clear();
        myPhantom = null;
    }
    static void checkDiscarded() { // run periodically
        MyPhantom aPhantom;
        while ((aPhantom = (MyPhantom) referenceQueue.poll()) != null) {
            aPhantom.clear();
            // do stuff with aPhantom
        }
    }
}

However, I am using Java 8, and in Java 8, phantom references are not automatically cleared when they are enqueued into the reference queue. (I know that this is fixed in Java 9, but unfortunately, I must use Java 8.) This means that, once the GC determines that the main object is not strongly reachable, and enqueues the phantom reference, it still cannot reclaim the memory of the main object, until I manually clear the phantom reference after I dequeue it in checkDiscarded(). I am concerned that, during the period of time between the GC enqueuing the phantom reference and me dequeueing it from the queue, the main object will remain in memory when it's unnecessary. My main object references many other objects which take a lot of memory, so I would not want it staying in memory for longer than without this feature.

To avoid this problem of the phantom reference preventing the main object from being reclaimed, I came up with the idea of using a dummy object as the referent of the phantom reference instead of my main object. This dummy object will be referenced only from my main object, so it will become unreachable at the same time as my main object. Since the dummy object is small, I don't mind it staying in memory for a longer period of time, as long as my main object is reclaimed as soon as it is no longer reachable. Does this seem like a good idea, and is it really better than using the main object as the referent?

import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.util.HashSet;
import java.util.Set;

class MainObject {
    static class MyPhantom extends PhantomReference<Object> {
        // keeps the phantom references themselves strongly reachable
        static Set<MyPhantom> phantomSet = new HashSet<>();
        MyPhantom(Object obj, ReferenceQueue<Object> queue) {
            super(obj, queue);
            phantomSet.add(this);
        }
        @Override
        public void clear() { // Reference.clear() is public, so the override must be public too
            super.clear();
            phantomSet.remove(this);
        }
    }
    Object dummyObject = new Object(); // small stand-in referent
    MyPhantom myPhantom = new MyPhantom(dummyObject, referenceQueue);
    static ReferenceQueue<Object> referenceQueue = new ReferenceQueue<>();
    void markUsed() {
        myPhantom.clear();
        myPhantom = null;
    }
    static void checkDiscarded() { // run periodically
        MyPhantom aPhantom;
        while ((aPhantom = (MyPhantom) referenceQueue.poll()) != null) {
            aPhantom.clear();
            // do stuff with aPhantom
        }
    }
}

Another idea I am considering is to use weak references instead of phantom references. Unlike phantom references in Java 8, weak references are cleared when they are enqueued, so they do not prevent the referent from being reclaimed. I understand that phantom references are usually preferred for resource cleanup because they are only enqueued after the referent has been finalized and is guaranteed not to be used anymore, whereas weak references are enqueued before finalization, when resources cannot be freed yet and the finalizer might still resurrect the object. However, that's not a concern in my case, as I am not "cleaning up" any resources, but just reporting that my main object was discarded before being used, which can be done while the object is still in memory. My main objects also do not have a finalize() method, so there is no concern about resurrecting the object. So do you think weak references would be a better match for my case?
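
For comparison, the weak-reference variant described above might look roughly like the following sketch. The names UsageTracker, Token, track and drainDiscarded are hypothetical and not taken from the code above, and the concurrent set is an assumption made because the periodic check may run on another thread. It only targets Java 8 APIs.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: reports tracked objects that were discarded before markUsed().
class UsageTracker {
    static final class Token extends WeakReference<Object> {
        final String description; // hypothetical label used in the report
        Token(Object referent, ReferenceQueue<Object> queue, String description) {
            super(referent, queue);
            this.description = description;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Tokens must stay strongly reachable, otherwise they may be collected
    // themselves and never show up in the queue.
    private final Set<Token> pending = ConcurrentHashMap.newKeySet();

    Token track(Object obj, String description) {
        Token token = new Token(obj, queue, description);
        pending.add(token);
        return token;
    }

    void markUsed(Token token) {
        pending.remove(token);
        token.clear(); // a reference cleared by the program is not enqueued later
    }

    // Run periodically: collects the labels of objects discarded while still unused.
    List<String> drainDiscarded() {
        List<String> reports = new ArrayList<>();
        Token token;
        while ((token = (Token) queue.poll()) != null) {
            if (pending.remove(token)) {
                reports.add(token.description);
            }
        }
        return reports;
    }
}
```

Because the collector clears a weak reference when it enqueues it, nothing in this sketch holds on to the main object between enqueuing and the next drainDiscarded() call.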


Solution

  • Weak and phantom references are indeed equivalent when no finalization is involved. However, a common misconception is to assume that an object is only subject to finalizer reachability and potential resurrection when its own class has a finalize() method.

    To demonstrate the behavior, we may use

    Object o = new Object();
    ReferenceQueue<Object> q = new ReferenceQueue<>();
    Reference<?> weak = new WeakReference<>(o, q), phantom = new PhantomReference<>(o, q), r;
    // ...
    o = null;
    for(int cycles = 0, got = 0; got < 2; ) {
        while((r = q.remove(100)) == null) {
            System.gc();
            cycles++;
        }
        got++;
        System.out.println(
            (r == weak? "weak": r == phantom? "phantom": "magic unicorn")
          + " ref queued after " + cycles + " cycles");
    }
    

    This typically prints either

    phantom ref queued after 1 cycles
    weak ref queued after 1 cycles
    

    or

    weak ref queued after 1 cycles
    phantom ref queued after 1 cycles
    

    as both references are truly treated the same in this case and there’s no preferred order when both are enqueued in the same garbage collection.

    But when we replace the // ... line with

    class Legacy {
        private Object finalizerReachable;
    
        Legacy(Object o) {
            finalizerReachable = o;
        }
        @Override
        protected void finalize() throws Throwable {
            System.out.println("Legacy.finalize()");
        }
    }
    new Legacy(o);
    

    the output changes to something like

    Legacy.finalize()
    weak ref queued after 1 cycles
    phantom ref queued after 2 cycles
    

    as Legacy’s finalizer is enough to make the object finalizer reachable and to open the possibility of resurrecting the object during finalization.

    This doesn’t have to stop you from using this approach. You may decide that there’s no such finalizer in your entire application, or accept this scenario as a known limitation that only applies if someone intentionally adds such a finalizer. JDK 18 has marked the finalize() method as deprecated for removal, so this issue will disappear in the future without requiring you to take any action.


    Still, your other approach using a dummy object with a PhantomReference will work as intended, as the phantom reference is only enqueued when the dummy object, and hence also the outer object, is no longer even finalizer reachable. The drawback is the (very) slightly higher memory consumption due to the additional dummy object.

    Mind that the markUsed() method may set dummyObject to null too.
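
    A minimal sketch of the dummy-object variant with that change could look like this. The names TrackedObject, Sentinel, QUEUE and PENDING are made up for illustration, the concurrent set and the counting return value are assumptions, and the guard for repeated markUsed() calls is an addition that the question's code does not have.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical rework of the question's second MainObject.
class TrackedObject {
    static final class Sentinel extends PhantomReference<Object> {
        Sentinel(Object referent, ReferenceQueue<Object> queue) {
            super(referent, queue);
        }
    }

    static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();
    // Keeps the sentinels strongly reachable until they are used or reported.
    static final Set<Sentinel> PENDING = ConcurrentHashMap.newKeySet();

    Object dummyObject = new Object(); // small referent, reachable only via this object
    Sentinel sentinel = new Sentinel(dummyObject, QUEUE);

    TrackedObject() {
        PENDING.add(sentinel);
    }

    void markUsed() {
        if (sentinel == null) {
            return; // already marked as used
        }
        PENDING.remove(sentinel);
        sentinel.clear();
        sentinel = null;
        dummyObject = null; // the dummy has no purpose once the object counts as used
    }

    // Run periodically: returns how many unused instances were discarded.
    static int checkDiscarded() {
        int count = 0;
        Sentinel s;
        while ((s = (Sentinel) QUEUE.poll()) != null) {
            s.clear(); // on Java 8 this is what releases the dummy object
            if (PENDING.remove(s)) {
                count++;
            }
        }
        return count;
    }
}
```

    On Java 8, only the small dummy object is kept alive between enqueuing and the next checkDiscarded() call, not the outer object and whatever it references.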


    Another possible point of view is that, since your feature is intended to log a wrong usage of your class, which should normally not happen, it doesn’t matter if it temporarily consumes more memory when that happens. When markUsed() has been called, the phantom reference is cleared and left to garbage collection without getting enqueued, so in the case of correct usage, the memory is not held longer than necessary.
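
    The claim that a reference cleared by markUsed() is never enqueued can be checked with a small experiment along these lines. ClearedReferenceDemo and its method are illustrative names, and System.gc() is only a hint to the JVM, so this is a sketch rather than a proof.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

// Hypothetical demo class, not part of the question's code.
class ClearedReferenceDemo {
    // Returns true if a reference cleared by the program still showed up in the queue.
    static boolean clearedReferenceWasEnqueued() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();
        PhantomReference<Object> ref = new PhantomReference<>(referent, queue);

        ref.clear();     // what markUsed() does
        referent = null; // the object is discarded afterwards

        try {
            for (int i = 0; i < 3; i++) {
                System.gc(); // request a collection; the JVM may ignore it
                if (queue.remove(50) != null) {
                    return true;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return false;
    }
}
```

    Since clear() does not enqueue the reference and the collector only enqueues references it clears itself, the method is expected to return false.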